Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create zip archive for instant download

In a web app I am working on, the user can create a zip archive of a folder full of files. Here here's the code:

files = torrent[0].files
    zipfile = z.ZipFile(zipname, 'w')
    output = ""

    for f in files:
        zipfile.write(settings.PYRAT_TRANSMISSION_DOWNLOAD_DIR + "/" + f.name, f.name)

downloadurl = settings.PYRAT_DOWNLOAD_BASE_URL + "/" + settings.PYRAT_ARCHIVE_DIR + "/" + filename
output = "Download <a href=\"" + downloadurl + "\">" + torrent_name + "</a>"
return HttpResponse(output)

But this has the nasty side effect of a long wait (10+ seconds) while the zip archive is being downloaded. Is it possible to skip this? Instead of saving the archive to a file, is it possible to send it straight to the user?

I do beleive that torrentflux provides this excat feature I am talking about. Being able to zip GBs of data and download it within a second.

like image 353
Josh Hunt Avatar asked Jun 14 '09 10:06

Josh Hunt


2 Answers

Check this Serving dynamically generated ZIP archives in Django

like image 94
jitter Avatar answered Oct 04 '22 21:10

jitter


As mandrake says, constructor of HttpResponse accepts iterable objects.

Luckily, ZIP format is such that archive can be created in single pass, central directory record is located at the very end of file:

enter image description here

(Picture from Wikipedia)

And luckily, zipfile indeed doesn't do any seeks as long as you only add files.

Here is the code I came up with. Some notes:

  • I'm using this code for zipping up a bunch of JPEG pictures. There is no point compressing them, I'm using ZIP only as container.
  • Memory usage is O(size_of_largest_file) not O(size_of_archive). And this is good enough for me: many relatively small files that add up to potentially huge archive
  • This code doesn't set Content-Length header, so user doesn't get nice progress indication. It should be possible to calculate this in advance if sizes of all files are known.
  • Serving the ZIP straight to user like this means that resume on downloads won't work.

So, here goes:

import zipfile

class ZipBuffer(object):
    """ A file-like object for zipfile.ZipFile to write into. """

    def __init__(self):
        self.data = []
        self.pos = 0

    def write(self, data):
        self.data.append(data)
        self.pos += len(data)

    def tell(self):
        # zipfile calls this so we need it
        return self.pos

    def flush(self):
        # zipfile calls this so we need it
        pass

    def get_and_clear(self):
        result = self.data
        self.data = []
        return result

def generate_zipped_stream():
    sink = ZipBuffer()
    archive = zipfile.ZipFile(sink, "w")
    for filename in ["file1.txt", "file2.txt"]:
        archive.writestr(filename, "contents of file here")
        for chunk in sink.get_and_clear():
            yield chunk

    archive.close()
    # close() generates some more data, so we yield that too
    for chunk in sink.get_and_clear():
        yield chunk

def my_django_view(request):
    response = HttpResponse(generate_zipped_stream(), mimetype="application/zip")
    response['Content-Disposition'] = 'attachment; filename=archive.zip'
    return response
like image 28
Pēteris Caune Avatar answered Oct 04 '22 23:10

Pēteris Caune