
Sending multiple .CSV files to .ZIP without storing to disk in Python

I'm working on a reporting application for my Django-powered website. I want to run several reports and have each report generate a .csv file in memory, so that all of them can be downloaded together as a single .zip. I would like to do this without storing any files to disk. So far, to generate a single .csv file, I am following the usual approach:

import csv
import StringIO  # Python 2 module
from django.http import HttpResponse

mem_file = StringIO.StringIO()
writer = csv.writer(mem_file)
writer.writerow(["My content", my_value])
mem_file.seek(0)
response = HttpResponse(mem_file, content_type='text/csv')
response['Content-Disposition'] = 'attachment; filename=my_file.csv'

This works fine, but only for a single, unzipped .csv. If I had, for example, a list of .csv files created with a StringIO stream:

firstFile = StringIO.StringIO()
# write some data to the file

secondFile = StringIO.StringIO()
# write some data to the file

thirdFile = StringIO.StringIO()
# write some data to the file

myFiles = [firstFile, secondFile, thirdFile]

How could I return a compressed file that contains all objects in myFiles and can be properly unzipped to reveal three .csv files?

Jamie Counsell asked Jul 31 '14 16:07



1 Answer

zipfile is a standard library module that does exactly what you're looking for. For your use case, the meat and potatoes is the ZipFile method writestr, which takes the name a file should have inside the archive and the data to store under that name.

In the code below, I've used a sequential naming scheme for the files when they're unzipped, but this can be switched to whatever you'd like.

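import zipfile
import StringIO  # Python 2 module

# files: the list of in-memory CSV buffers (myFiles in the question)
zipped_file = StringIO.StringIO()
with zipfile.ZipFile(zipped_file, 'w') as archive:
    for i, csv_file in enumerate(files):
        # Rewind so that read() returns the whole buffer
        csv_file.seek(0)
        archive.writestr("{}.csv".format(i), csv_file.read())

# Rewind the archive so it can be read from the start
zipped_file.seek(0)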
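
The sequential names above can be swapped for real ones by pairing each buffer with a name. Here is a minimal sketch, assuming a hypothetical helper called zip_named_buffers and made-up file names and contents (none of these come from the question). It uses io.BytesIO so that it also runs on Python 3, and it reads the archive back to check that it unzips to the expected entries:

```python
import io
import zipfile


def zip_named_buffers(named_buffers):
    """Zip (name, buffer) pairs into an in-memory archive (hypothetical helper)."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, 'w') as zf:
        for name, buf in named_buffers:
            # getvalue() returns the whole buffer, whatever the current position
            zf.writestr(name, buf.getvalue())
    archive.seek(0)
    return archive


# Illustrative data only
sales = io.BytesIO(b"region,total\r\nnorth,10\r\n")
users = io.BytesIO(b"name,active\r\nalice,1\r\n")
archive = zip_named_buffers([("sales.csv", sales), ("users.csv", users)])

# Read the archive back to confirm it contains the named entries
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    first = zf.read("sales.csv")
```

Passing the names explicitly keeps the naming decision with the code that generates each report, instead of relying on list order.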

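If you want to future-proof your code (hint hint Python 3 hint hint), you might want to switch over to using io.BytesIO for the archive instead of StringIO, since a zip file is binary data and Python 3 keeps bytes and text strictly separate. Another bonus in the version below is getvalue(), which returns a buffer's entire contents regardless of the current position, so no explicit seek is needed before each read (I haven't tested this behavior with Django's HttpResponse, so I've left that final seek in there just in case). Note that on Python 3 csv.writer needs a text stream, so the individual report buffers would be io.StringIO objects; writestr accepts the str that getvalue() returns and encodes it as UTF-8.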

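import io
import zipfile

# files: the list of in-memory CSV buffers (myFiles in the question)
zipped_file = io.BytesIO()
with zipfile.ZipFile(zipped_file, 'w') as archive:
    for i, csv_file in enumerate(files):
        # getvalue() ignores the current position, so no seek is needed here
        archive.writestr("{}.csv".format(i), csv_file.getvalue())

# Rewind the archive so it can be read from the start
zipped_file.seek(0)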
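
To tie the pieces together on Python 3, here is a rough end-to-end sketch. The helper names build_csv and build_zip, the report names, and the row data are all assumptions made for illustration. One thing worth knowing is that zipfile.ZipFile stores entries uncompressed by default (ZIP_STORED), so the sketch passes zipfile.ZIP_DEFLATED to actually compress them:

```python
import csv
import io
import zipfile


def build_csv(rows):
    """Write rows to an in-memory text buffer (hypothetical helper)."""
    buf = io.StringIO(newline='')
    csv.writer(buf).writerows(rows)
    return buf


def build_zip(reports):
    """Zip a {filename: rows} mapping into an in-memory archive (hypothetical helper)."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, 'w', compression=zipfile.ZIP_DEFLATED) as zf:
        for name, rows in reports.items():
            # writestr encodes str data as UTF-8 before storing it
            zf.writestr(name, build_csv(rows).getvalue())
    archive.seek(0)
    return archive


# Illustrative data only
reports = {
    "report_a.csv": [["My content", 42]],
    "report_b.csv": [["name", "active"], ["alice", 1]],
}
archive = build_zip(reports)

# In a Django view, the bytes could then be returned along these lines
# (untested here, shown only as an assumption about how you'd wire it up):
#   response = HttpResponse(archive.getvalue(), content_type='application/zip')
#   response['Content-Disposition'] = 'attachment; filename=reports.zip'
```

Nothing touches the disk at any point: each report lives in an io.StringIO, and the archive lives in an io.BytesIO.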

dwlz answered Oct 01 '22 00:10
