
Django Pandas to http response (download file)

Python: 2.7.11

Django: 1.9

Pandas: 0.17.1

How should I go about creating a potentially large xlsx file download? I'm creating an xlsx file with pandas from a list of dictionaries and now need to let the user download it. The list is held in a variable and must not be saved locally (on the server).

Example:

df = pandas.DataFrame(self.csvdict)
writer = pandas.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()

This example just creates the file and saves it where the executing script is located. What I need is to write it to an HTTP response so that the user gets a download prompt.

I have found a few posts about doing this with xlsxwriter but none for pandas. I also think that I should be using 'StreamingHttpResponse' for this and not 'HttpResponse'.

Adrian Z. asked Feb 08 '16 10:02


3 Answers

I will elaborate on what @jmcnamara wrote. This is for the latest versions of Excel, Pandas and Django. The import statements go at the top of your views.py and the remaining code can go in a view:

import pandas as pd
from django.http import HttpResponse
# xlsx data is binary, so use BytesIO (available in io on Python 2.6+ and 3)
from io import BytesIO as IO

# this is my output data, a list of lists
output = some_function()
df_output = pd.DataFrame(output)

# my "Excel" file, which is an in-memory output file (buffer) 
# for the new workbook
excel_file = IO()

xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')

df_output.to_excel(xlwriter, sheet_name='sheetname')

# close() saves the workbook; newer pandas versions no longer have save()
xlwriter.close()

# important step, rewind the buffer or when it is read() you'll get nothing
# but an error message when you try to open your zero length file in Excel
excel_file.seek(0)

# set the mime type so that the browser knows what to do with the file
response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')

# set the file name in the Content-Disposition header
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'

return response
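
The Content-Disposition line above works for a plain ASCII name like myfile.xlsx. For names with spaces or non-ASCII characters, one option is to quote the name and add an RFC 5987 style `filename*` parameter. The sketch below is a hypothetical helper (`content_disposition` is not part of Django or pandas) and assumes Python 3:

```python
from urllib.parse import quote


def content_disposition(filename):
    """Build an attachment Content-Disposition value for a download.

    Hypothetical helper for illustration, not part of Django or pandas.
    """
    # ASCII fallback for older clients: replace non-ASCII characters and
    # neutralise characters that would break the quoted string.
    fallback = filename.encode('ascii', 'replace').decode('ascii')
    fallback = fallback.replace('\\', '_').replace('"', '_')
    # The filename* parameter carries the exact name, percent-encoded as UTF-8.
    encoded = quote(filename, safe='')
    return "attachment; filename=\"{}\"; filename*=UTF-8''{}".format(fallback, encoded)


print(content_disposition('my report.xlsx'))
```

In the view this would be used as `response['Content-Disposition'] = content_disposition('my report.xlsx')`.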

PlacidLush answered Oct 23 '22 17:10


Jmcnamara is pointing you in the right direction. Translated to your question, you are looking for the following code:

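from io import BytesIO

# xlsx data is binary, so write it to an in-memory bytes buffer
sio = BytesIO()
PandasDataFrame = pandas.DataFrame(self.csvdict)
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name=sheetname)
# close() saves the workbook into the buffer
PandasWriter.close()

# getvalue() returns the whole buffer, so no seek(0) is needed
workbook = sio.getvalue()

# wrap the bytes in a list so the response is not iterated byte by byte
response = StreamingHttpResponse([workbook], content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s' % filename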

Notice that you are saving the data to an in-memory buffer and not to a file location. This way the file is never written to disk before you generate the response.
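
StreamingHttpResponse takes an iterator of byte strings, so one way to feed it the in-memory workbook is in fixed-size chunks. The following is only a sketch: `iter_chunks` and the 8192-byte chunk size are hypothetical choices, and the demo buffer holds stand-in bytes rather than a real workbook so that it runs without pandas or Django.

```python
from io import BytesIO


def iter_chunks(buffer, chunk_size=8192):
    """Yield successive chunks of bytes from a file-like object."""
    buffer.seek(0)  # start from the beginning of the buffer
    while True:
        chunk = buffer.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in bytes for illustration; in a view this would be the workbook buffer.
demo = BytesIO(b'x' * 20000)
chunks = list(iter_chunks(demo))
print([len(c) for c in chunks])  # [8192, 8192, 3616]
```

In a view the generator could be passed directly, for example `StreamingHttpResponse(iter_chunks(sio), content_type=...)`. Note that the workbook is still built completely in memory before the first chunk is sent, so this changes how the data is sent, not how much memory is needed to create it.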

MartinH answered Oct 23 '22 17:10


With Pandas 0.17+ you can use a StringIO/BytesIO object as a filehandle to pd.ExcelWriter. For example:

import pandas as pd
from io import BytesIO

# xlsx data is binary; io.BytesIO works on both Python 2.7 and Python 3
output = BytesIO()

# Use the BytesIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')

# Write the data frame to the BytesIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
# close() saves the workbook into the buffer
writer.close()
xlsx_data = output.getvalue()

print(len(xlsx_data))

After that follow the XlsxWriter Python 2/3 HTTP examples.

For older versions of Pandas you can use this workaround.
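
The answers on this page read the buffer in two ways: getvalue(), or seek(0) followed by read(). A minimal sketch of the difference, using only io.BytesIO and stand-in bytes in place of a real workbook:

```python
from io import BytesIO

buf = BytesIO()
buf.write(b'workbook bytes')  # stand-in for what ExcelWriter writes

# After writing, the position is at the end, so read() returns nothing.
at_end = buf.read()

# getvalue() ignores the position and returns the whole buffer.
whole = buf.getvalue()

# Rewinding first makes read() return the same data.
buf.seek(0)
after_rewind = buf.read()

print(at_end, whole == after_rewind)
```

Either approach gives the full file contents; forgetting the seek(0) before read() is what produces a zero-length download.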

jmcnamara answered Oct 23 '22 19:10