Is there a Python module for regex matching in zip files?

I have over a million text files compressed into 40 zip files. I also have a list of about 500 model names of phones. I want to find out the number of times a particular model was mentioned in the text files.

Is there a Python module that can run a regex match against the files without unzipping them? Or is there some other simple way to solve this problem without unzipping?

asked Aug 18 '08 07:08

cnu

1 Answer

There's nothing that will automatically do what you want.

However, Python's standard zipfile module makes this easy to do. Here's how to iterate over the lines of every file inside an archive without extracting anything to disk (the data is decompressed in memory as it is read).

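#!/usr/bin/python3

import zipfile

# Open the archive; members are read directly, nothing is extracted to disk.
with zipfile.ZipFile('myfile.zip') as f:
    for subfile in f.namelist():
        print(subfile)
        # read() returns bytes; UTF-8 is an assumption about the text files.
        data = f.read(subfile).decode('utf-8', errors='replace')
        for line in data.splitlines():
            print(line)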
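
To get from iterating over lines to the per-model counts the question asks for, something like the following might work. It is only a sketch under stated assumptions: the files are UTF-8 text, matching is case-insensitive and on whole words, and build_pattern and count_mentions are hypothetical helper names, not part of zipfile or re.

```python
import re
import zipfile
from collections import Counter


def build_pattern(model_names):
    """Compile one regex that matches any of the given model names."""
    # Longest names first, so "Galaxy S10 Plus" wins over "Galaxy S10".
    ordered = sorted(model_names, key=len, reverse=True)
    alternation = "|".join(re.escape(name) for name in ordered)
    # The lookarounds keep a name from matching inside a longer word.
    return re.compile(r"(?<!\w)(?:%s)(?!\w)" % alternation, re.IGNORECASE)


def count_mentions(zip_paths, model_names, encoding="utf-8"):
    """Count mentions of each model across every file in every archive."""
    pattern = build_pattern(model_names)
    # Map a lower-cased match back to the spelling used in the input list.
    canonical = {name.lower(): name for name in model_names}
    counts = Counter()
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                # Assumption: each member is text in the given encoding.
                text = archive.read(info).decode(encoding, errors="replace")
                for match in pattern.finditer(text):
                    key = match.group(0).lower()
                    counts[canonical.get(key, key)] += 1
    return counts
```

Calling count_mentions(['a.zip', 'b.zip'], ['Nokia 3310', 'Galaxy S10']) would return a Counter keyed by model name (the file and model names here are placeholders). Compiling all 500 names into a single pattern means each file is scanned once rather than 500 times.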

answered Nov 03 '22 01:11

Mark Harrison

Tags: python, regex, zip, text-processing
