How do you unzip very large files in Python?

Using Python 2.4 and the built-in zipfile module (the ZipFile class), I cannot read very large zip files (larger than 1 or 2 GB) because it wants to hold the entire uncompressed contents of a file in memory. Is there another way to do this, either with a third-party library or some other hack, or must I "shell out" and unzip it that way (which obviously isn't as cross-platform)?

asked Dec 03 '08 23:12

Marc Novakowski

1 Answer

Here's an outline of how to decompress large files: locate each member's raw data in the archive yourself and feed it to a zlib decompression object.

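import struct
import zipfile
import zlib

doc = "large.zip"   # placeholder: path to the archive to extract
CHUNK = 1 << 20     # read 1 MiB of compressed data at a time

src = open(doc, "rb")
zf = zipfile.ZipFile(src)
for m in zf.infolist():
    if m.filename.endswith("/"):
        continue  # directory entry, no data to extract

    # Examine the 30-byte local file header. Its name and extra-field lengths
    # can differ from the central directory's, so read them from here.
    # (The member comment is stored only in the central directory.)
    print("%s %d %d" % (m.filename, m.header_offset, m.compress_size))
    src.seek(m.header_offset)
    header = struct.unpack("<4s5H3L2H", src.read(30))
    src.seek(header[9] + header[10], 1)  # skip file name and extra field

    # Build a decompression object for a raw deflate stream (no zlib header).
    # Stored (uncompressed) members are copied through unchanged.
    if m.compress_type == zipfile.ZIP_DEFLATED:
        decomp = zlib.decompressobj(-15)
    else:
        decomp = None

    # Read and decompress block by block so memory use stays bounded
    out = open(m.filename, "wb")
    remaining = m.compress_size
    while remaining > 0:
        block = src.read(min(CHUNK, remaining))
        if not block:
            break  # truncated archive
        remaining -= len(block)
        if decomp:
            block = decomp.decompress(block)
        out.write(block)
    if decomp:
        out.write(decomp.flush())
    out.close()

zf.close()
src.close()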
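
On a newer Python than the 2.4 in the question, the same block-by-block idea can be sketched without parsing headers by hand, since ZipFile.open (added in Python 2.6) returns a file-like object that decompresses as it is read. The sketch below assumes Python 3; extract_streaming and its parameters are hypothetical names chosen for illustration, not part of the zipfile module.

```python
import os
import shutil
import zipfile


def extract_streaming(zip_path, dest_dir, chunk_size=1024 * 1024):
    """Extract all members of zip_path into dest_dir without loading a
    whole member into memory. Returns the list of files written.

    Hypothetical helper for illustration; assumes Python 3.
    """
    written = []
    dest_root = os.path.abspath(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            target = os.path.abspath(os.path.join(dest_root, info.filename))
            # Refuse entries that would land outside dest_dir
            if not target.startswith(dest_root + os.sep):
                raise ValueError("unsafe path in archive: %r" % info.filename)
            if info.filename.endswith("/"):
                # Directory entry: create it and move on
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # zf.open() yields a file-like object that decompresses lazily;
            # copyfileobj moves it to disk chunk_size bytes at a time.
            with zf.open(info) as member, open(target, "wb") as out:
                shutil.copyfileobj(member, out, chunk_size)
            written.append(target)
    return written
```

Calling extract_streaming("large.zip", "out") would write each member under out/, with peak memory governed roughly by chunk_size rather than by the size of the largest member.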

answered Oct 13 '22 06:10

S.Lott
