
Create and stream a large archive without storing it in memory or on disk

Tags: python, http

I want to allow users to download an archive of multiple large files at once. However, the files and the archive may be too large to store in memory or on disk on my server (they are streamed in from other servers on the fly). I'd like to generate the archive as I stream it to the user.

I can use tar or zip, whichever is simplest. I am using Django, which allows me to return a generator or file-like object in my response. This object could be used to pump the process along. However, I am having trouble figuring out how to build this sort of thing around the zipfile or tarfile libraries, and I'm afraid they may not support reading the input files incrementally, or reading the archive while it is still being built.

This answer on converting an iterator to a file-like object might help. tarfile#addfile takes an iterable, but it appears to immediately pass that to shutil.copyfileobj, so this may not be as generator-friendly as I had hoped.
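To make the iterator-to-file idea concrete, here is a minimal sketch of such an adapter. The class name `IterReader` is made up for illustration, and it assumes the upstream source yields `bytes` chunks. Wrapping it in `io.BufferedReader` matters because the raw object may return fewer bytes than requested, and tarfile treats a short read as truncated input.

```python
import io


class IterReader(io.RawIOBase):
    """Expose an iterator of bytes chunks as a read-only file-like object.

    Hypothetical helper for illustration; not part of tarfile or zipfile.
    """

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buf):
        # Pull from the iterator only once the previous chunk is used up,
        # so at most one chunk is held in memory at a time.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # signals end of file
        n = min(len(buf), len(self._leftover))
        buf[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def as_file(chunks):
    """Return a buffered reader whose read(n) fills up to n bytes."""
    return io.BufferedReader(IterReader(chunks))
```

Note that tarfile still needs each member's size up front (it goes into the header), so this only helps when the remote server reports a length before the body arrives.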

Nick Retallack asked May 01 '12 22:05




1 Answer

I ended up using SpiderOak ZipStream.
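For anyone who would rather stay in the standard library, the same idea can be sketched for uncompressed tar. This is not ZipStream's API, just a rough illustration under some assumptions: `stream_tar` and its `(name, size, fileobj)` input format are invented here, each member's size is known before its data arrives, and `fileobj.read(n)` returns bytes. It emits headers with `TarInfo.tobuf()` and never holds more than one chunk in memory.

```python
import tarfile

BLOCK = 512  # tar archives are made of 512-byte blocks


def stream_tar(members, chunk_size=64 * 1024):
    """Yield the bytes of an uncompressed tar archive piece by piece.

    `members` is an iterable of (name, size, fileobj) triples; this input
    shape is an assumption made for the sketch.
    """
    for name, size, fileobj in members:
        info = tarfile.TarInfo(name=name)
        info.size = size
        # Header block(s) for this member.
        yield info.tobuf(format=tarfile.GNU_FORMAT)

        remaining = size
        while remaining > 0:
            chunk = fileobj.read(min(chunk_size, remaining))
            if not chunk:
                raise IOError("%s ended %d bytes early" % (name, remaining))
            remaining -= len(chunk)
            yield chunk

        # Member data is padded with NULs to a whole number of blocks.
        padding = -size % BLOCK
        if padding:
            yield b"\0" * padding

    # Two empty blocks mark the end of the archive.
    yield b"\0" * (2 * BLOCK)
```

In current Django versions the generator can be handed to `StreamingHttpResponse(stream_tar(members), content_type="application/x-tar")`. Zip is harder to do this way because its central directory is written at the end, which is why a dedicated streaming library is the easier route there.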

Nick Retallack answered Nov 10 '22 13:11