
How to walk a tar.gz file that contains zip files without extraction

Tags: python

I have a large tar.gz file to analyze using a Python script. The tar.gz file contains a number of zip files, which might in turn embed other .gz files. Before extracting anything, I would like to walk through the directory structure within the compressed files to see if certain files or directories are present. Looking at the tarfile and zipfile modules, I don't see any existing function that allows me to get a table of contents of a zip file within a tar.gz file.

Appreciate your help,

JasonA asked Jul 20 '10 19:07




1 Answer

You can't get at the inner archive's contents without extracting it from the outer one. However, you don't need to extract it to disk if you don't want to. You can use the tarfile.TarFile.extractfile method to get a file-like object that you can then pass to tarfile.open as the fileobj argument. For example, given these nested tarfiles:

$ cat bar/baz.txt     
This is bar/baz.txt.
$ tar cvfz bar.tgz bar
bar/
bar/baz.txt
$ tar cvfz baz.tgz bar.tgz
bar.tgz

You can access files from the inner one like so:

>>> import tarfile
>>> baz = tarfile.open('baz.tgz')
>>> bar = tarfile.open(fileobj=baz.extractfile('bar.tgz'))
>>> bar.extractfile('bar/baz.txt').read()
b'This is bar/baz.txt.\n'

and they're only ever extracted to memory.
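
The question asks about zip files inside a tar.gz rather than nested tarballs, and the same in-memory approach can be sketched for that case. What follows is a minimal sketch, assuming Python 3 and that each inner zip is small enough to hold in memory; the function name walk_nested is hypothetical and not part of tarfile or zipfile.

```python
import io
import tarfile
import zipfile


def walk_nested(tar_path):
    """Yield (zip_member_name, inner_name) for each entry of each zip in a tar.gz.

    Hypothetical helper for illustration; nothing is written to disk.
    """
    with tarfile.open(tar_path, "r:gz") as tar:
        for member in tar:
            # Skip directories, links, and anything that is not a .zip file.
            if not (member.isfile() and member.name.endswith(".zip")):
                continue
            # extractfile() returns a file-like object backed by the archive.
            data = tar.extractfile(member).read()
            # ZipFile needs a seekable file object, so wrap the bytes in BytesIO.
            with zipfile.ZipFile(io.BytesIO(data)) as zf:
                for info in zf.infolist():
                    yield member.name, info.filename
```

To check whether a given file or directory is present, the generator can be scanned directly, for example with any(inner.startswith('docs/') for _, inner in walk_nested('archive.tar.gz')), where the path and prefix are placeholders. A .gz entry inside a zip has no table of contents of its own, since gzip compresses a single stream; if its contents are needed, the object returned by ZipFile.open can be passed to gzip.GzipFile as the fileobj argument, again without touching the disk.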

Thomas Wouters answered Oct 29 '22 15:10