Read .tar.gz file in Python

Tags: python, file, gzip, tar

I have a 25 GB text file. I compressed it to tar.gz, which reduced it to 450 MB. Now I want to read that file from Python and process the text data. For this I referred to another question, but in my case the code doesn't work. The code is as follows:

    import tarfile
    import numpy as np

    tar = tarfile.open("filename.tar.gz", "r:gz")
    for member in tar.getmembers():
        f = tar.extractfile(member)
        content = f.read()
        Data = np.loadtxt(content)

The error is as follows:

    Traceback (most recent call last):
      File "dataExtPlot.py", line 21, in <module>
        content = f.read()
    AttributeError: 'NoneType' object has no attribute 'read'

Also, is there any other method to do this task?


asked May 27 '16 04:05

KrunalParmar

1 Answer

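The docs tell us that extractfile() returns None if the member is not a regular file or link. A tar archive usually also contains entries for directories, so iterating over getmembers() will hit at least one member that has no file contents to read.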
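To see this behaviour in isolation, here is a small sketch that builds a tiny archive in memory and checks what extractfile() returns for each member. The member names ("data" and "data/values.txt") and the helper names are made up for the demonstration; they are not taken from the question.

```python
import io
import tarfile


def build_demo_archive():
    """Return an in-memory .tar.gz holding one directory and one text file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # A directory entry: it has a header but no file contents.
        dir_info = tarfile.TarInfo(name="data")
        dir_info.type = tarfile.DIRTYPE
        dir_info.mode = 0o755
        tar.addfile(dir_info)

        # A regular file entry with a small payload.
        payload = b"1.0 2.0\n3.0 4.0\n"
        file_info = tarfile.TarInfo(name="data/values.txt")
        file_info.size = len(payload)
        tar.addfile(file_info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def describe_members(fileobj):
    """Map each member name to True if extractfile() returned None for it."""
    results = {}
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        for member in tar.getmembers():
            results[member.name] = tar.extractfile(member) is None
    return results


demo_results = describe_members(build_demo_archive())
for name, is_none in demo_results.items():
    print(name, "-> None" if is_none else "-> file object")
```

The directory entry is the one that yields None, which is what triggers the AttributeError in the question.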

One possible solution is to skip over the None results:

    tar = tarfile.open("filename.tar.gz", "r:gz")
    for member in tar.getmembers():
        f = tar.extractfile(member)
        if f is not None:
            content = f.read()
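As for another method: since the uncompressed text is 25 GB, calling read() on a member pulls the whole file into memory at once. A possible alternative is to iterate over the member line by line. The sketch below assumes the text is UTF-8 and that each line holds whitespace-separated numbers (a guess based on the use of np.loadtxt in the question); iter_text_lines and parse_row are hypothetical helper names, not part of tarfile.

```python
import io
import tarfile


def iter_text_lines(path, encoding="utf-8"):
    """Yield decoded lines from every regular file in a .tar.gz archive."""
    with tarfile.open(path, "r:gz") as tar:
        for member in tar:
            # Skip directories, links, and other non-regular members.
            if not member.isfile():
                continue
            raw = tar.extractfile(member)
            # Wrap the binary stream so we get str lines without
            # reading the whole member into memory.
            with io.TextIOWrapper(raw, encoding=encoding) as text:
                for line in text:
                    yield line.rstrip("\n")


def parse_row(line):
    """Assumed format: whitespace-separated floats on each line."""
    return [float(value) for value in line.split()]


if __name__ == "__main__":
    # "filename.tar.gz" is the archive name used in the question.
    for line in iter_text_lines("filename.tar.gz"):
        row = parse_row(line)
        # Process one row at a time here instead of storing everything.
```

Note also that np.loadtxt expects a file name or a file-like object, not the bytes returned by read(), so the last line of the question's code would need changing even after the None check is added.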

answered Sep 20 '22 13:09

Raymond Hettinger

