How to process a zip file with Python

Tags:

python

a.zip---
      -- b.txt
      -- c.txt
      -- d.txt

I am looking for ways to process the files inside a zip archive with Python.

I could extract the zip file to a temporary directory and then process each txt file one by one.

However, I am more interested in whether Python provides a way to treat the zip file as a specialized folder, so that I can process each txt file without manually extracting the archive first.

545

asked Sep 23 '11 19:09

q0987

2 Answers

The Python standard library can help here, through the zipfile module.

Doug Hellmann writes very informative posts about selected modules: https://pymotw.com/3/zipfile/

To comment on David's post: from Python 2.7 on, the ZipFile object can be used as a context manager, so the recommended way would be:

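import zipfile
# The with statement closes the archive automatically, even on errors.
with zipfile.ZipFile("zipfile.zip", "r") as f:
    for name in f.namelist():
        data = f.read(name)  # member contents as bytes, nothing is extracted to disk
        # Print the member name, its size in bytes, and its first 10 bytes.
        print(name, len(data), repr(data[:10]))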

The close method is called automatically when the with block exits. This is especially important if you write to the archive, because essential records are only written when it is closed.
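As a minimal sketch of why closing matters when writing, the snippet below creates a small archive inside a with block and then reopens it for reading. The helper name write_and_read_back and the member names and contents are made up for illustration; only ZipFile, writestr, namelist and read come from the zipfile module.

```python
import os
import tempfile
import zipfile


def write_and_read_back(archive_path):
    """Hypothetical helper: write two members, then read them back as bytes."""
    # Writing: the archive's central directory is written on close(),
    # which the with statement guarantees even if an exception is raised.
    with zipfile.ZipFile(archive_path, "w") as zf:
        zf.writestr("b.txt", "hello from b\n")
        zf.writestr("c.txt", "hello from c\n")

    # Reading: reopen the finished archive and read every member.
    with zipfile.ZipFile(archive_path, "r") as zf:
        return {name: zf.read(name) for name in zf.namelist()}


# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    contents = write_and_read_back(os.path.join(tmp, "a.zip"))
    for name, data in contents.items():
        print(name, len(data), repr(data[:10]))
```

If the archive were opened for writing without the with statement and never closed, the resulting file would likely be unreadable as a zip archive.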

123

answered Oct 02 '22 23:10

rocksportrocker

Yes, you can process each file individually. Take a look at the tutorial here. For your needs, you can do something like this example from that tutorial:

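import zipfile
# Open the archive for reading; nothing is extracted to disk.
archive = zipfile.ZipFile("zipfile.zip", "r")
for name in archive.namelist():
    data = archive.read(name)  # member contents as bytes
    print(name, len(data), repr(data[:10]))
archive.close()  # release the file handle when done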

This will iterate over each file in the archive and print out its name, length and the first 10 bytes.
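If the members are text files, reading them line by line may be more convenient than loading raw bytes. The following is only a sketch: the function name count_lines_per_text_member is hypothetical, and it assumes the txt files are UTF-8 encoded. It uses ZipFile.open, which returns a binary file-like object, wrapped in io.TextIOWrapper for decoding.

```python
import io
import os
import tempfile
import zipfile


def count_lines_per_text_member(archive_path, encoding="utf-8"):
    """Hypothetical helper: return {member name: line count} for .txt members."""
    counts = {}
    with zipfile.ZipFile(archive_path, "r") as zf:
        for info in zf.infolist():
            # Skip directory entries and anything that is not a txt file.
            if info.is_dir() or not info.filename.endswith(".txt"):
                continue
            # Stream the member and decode it as text (UTF-8 is an assumption).
            with zf.open(info) as raw, io.TextIOWrapper(raw, encoding=encoding) as text:
                counts[info.filename] = sum(1 for _ in text)
    return counts


# Build a small sample archive so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("b.txt", "one\ntwo\n")
        zf.writestr("c.txt", "three\n")
    print(count_lines_per_text_member(path))
```

Counting lines stands in for whatever per-line processing is actually needed; the loop body over text is the place to put it.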

The comprehensive reference documentation is here.
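For the "treat the zip file as a folder" part of the question, Python 3.8 and later also provide zipfile.Path, which gives a pathlib-like view of an archive. The sketch below is one possible approach, not part of the tutorial mentioned above; the function name read_text_members is hypothetical, and UTF-8 is assumed as the encoding.

```python
import os
import tempfile
import zipfile


def read_text_members(archive_path, encoding="utf-8"):
    """Hypothetical helper: return {name: text} for top-level .txt files."""
    texts = {}
    with zipfile.ZipFile(archive_path, "r") as zf:
        root = zipfile.Path(zf)  # pathlib-like view of the archive root
        for entry in root.iterdir():
            if entry.is_file() and entry.name.endswith(".txt"):
                texts[entry.name] = entry.read_text(encoding=encoding)
    return texts


# Build a small sample archive so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.zip")
    with zipfile.ZipFile(path, "w") as zf:
        zf.writestr("b.txt", "contents of b\n")
        zf.writestr("c.txt", "contents of c\n")
    for name, text in read_text_members(path).items():
        print(name, repr(text))
```

This only looks at the top level of the archive; nested folders would need a recursive walk over iterdir().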

answered Oct 02 '22 22:10

David Heffernan
