Python: 2.7.11
Django: 1.9
Pandas: 0.17.1
How should I go about creating a potentially large xlsx file download? I'm creating a xlsx file with pandas from a list of dictionaries and now need to give the user possibility to download it. The list is in a variable and is not allowed to be saved locally (on server).
Example:
df = pandas.DataFrame(self.csvdict)
writer = pandas.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()
This example would just create the file and save it where the executing script is located. What I need is to create it to a http response so that the user would get a download prompt.
I have found a few posts about doing this for a xlsxwriter but non for pandas. I also think that I should be using 'StreamingHttpResponse' for this and not a 'HttpResponse'.
I will elaborate on what @jmcnamara wrote. This if for the latest versions of Excel, Pandas and Django. The import statements would be at the top of your views.py and the remaining code could be in a view:
import pandas as pd
from django.http import HttpResponse
try:
from io import BytesIO as IO # for modern python
except ImportError:
from io import StringIO as IO # for legacy python
# this is my output data a list of lists
output = some_function()
df_output = pd.DataFrame(output)
# my "Excel" file, which is an in-memory output file (buffer)
# for the new workbook
excel_file = IO()
xlwriter = pd.ExcelWriter(excel_file, engine='xlsxwriter')
df_output.to_excel(xlwriter, 'sheetname')
xlwriter.save()
xlwriter.close()
# important step, rewind the buffer or when it is read() you'll get nothing
# but an error message when you try to open your zero length file in Excel
excel_file.seek(0)
# set the mime type so that the browser knows what to do with the file
response = HttpResponse(excel_file.read(), content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
# set the file name in the Content-Disposition header
response['Content-Disposition'] = 'attachment; filename=myfile.xlsx'
return response
Jmcnamara is pointing you in the rigth direction. Translated to your question you are looking for the following code:
sio = StringIO()
PandasDataFrame = pandas.DataFrame(self.csvdict)
PandasWriter = pandas.ExcelWriter(sio, engine='xlsxwriter')
PandasDataFrame.to_excel(PandasWriter, sheet_name=sheetname)
PandasWriter.save()
sio.seek(0)
workbook = sio.getvalue()
response = StreamingHttpResponse(workbook, content_type='application/vnd.openxmlformats-officedocument.spreadsheetml.sheet')
response['Content-Disposition'] = 'attachment; filename=%s' % filename
Notice the fact that you are saving the data to the StringIO variable and not to a file location. This way you prevent the file being saved before you generate the response.
With Pandas 0.17+ you can use a StringIO/BytesIO object as a filehandle to pd.ExcelWriter
. For example:
import pandas as pd
import StringIO
output = StringIO.StringIO()
# Use the StringIO object as the filehandle.
writer = pd.ExcelWriter(output, engine='xlsxwriter')
# Write the data frame to the StringIO object.
pd.DataFrame().to_excel(writer, sheet_name='Sheet1')
writer.save()
xlsx_data = output.getvalue()
print len(xlsx_data)
After that follow the XlsxWriter Python 2/3 HTTP examples.
For older versions of Pandas you can use this workaround.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With