Untar archive in Python with errors

Tags:

python

I download a bz2 file using Python. Then I want to unpack the archive using:

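import os

def unpack_file(dir, file):
    cwd = os.getcwd()
    os.chdir(dir)  # run tar from inside the target directory
    print "Unpacking file %s" % file
    cmd = "tar -jxf %s" % file  # -j: bzip2, -x: extract, -f: archive file
    print cmd
    os.system(cmd)
    os.chdir(cwd)  # restore the original working directory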

Unfortunately, this ends with an error:

bzip2: Compressed file ends unexpectedly;
    perhaps it is corrupted?  *Possible* reason follows.
bzip2: Inappropriate ioctl for device
    Input file = (stdin), output file = (stdout)

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

However, I can unpack the archive from the shell without any problem.

Do you have any idea what I am doing wrong?

Szymon Lipiński asked Dec 04 '22 05:12


1 Answer

For the record, the Python standard library ships with the tarfile module, which automatically handles the tar, tar.bz2, and tar.gz formats.

Additionally, you can do nifty things like get file lists, extract subsets of files or directories, or process the archive as a stream, so that it is decompressed and untarred in small chunks instead of decompressing the whole file first.

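import tarfile

# The default mode "r" detects the compression (none, gzip, bzip2) automatically.
with tarfile.open("sample.tar.gz") as tar:
    tar.extractall()  # extracts into the current working directory
# The archive is closed when the with block exits.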
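
Applied to the question, a rough sketch of unpack_file written with tarfile instead of shelling out to tar might look like the following. It also illustrates the listing, subset, and streaming features mentioned above. The helper names list_members, extract_subset, and stream_names are made up for illustration and are not part of tarfile or of the original post. Note that this only changes how the archive is unpacked; if the downloaded file really is truncated, tarfile will raise an error as well.

```python
import os
import tarfile


def unpack_file(directory, filename):
    """Extract the archive `filename` (located in `directory`) into `directory`."""
    archive_path = os.path.join(directory, filename)
    # "r:*" opens for reading with transparent compression detection,
    # so the same call works for .tar, .tar.gz and .tar.bz2 archives.
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(path=directory)


def list_members(archive_path):
    """Return the member names stored in the archive without extracting them."""
    with tarfile.open(archive_path) as tar:
        return tar.getnames()


def extract_subset(archive_path, prefix, dest):
    """Extract only members whose name starts with `prefix` into `dest`."""
    with tarfile.open(archive_path) as tar:
        wanted = [m for m in tar.getmembers() if m.name.startswith(prefix)]
        tar.extractall(path=dest, members=wanted)
        return [m.name for m in wanted]


def stream_names(archive_path):
    """Yield member names while reading the archive as a forward-only stream."""
    # "r|*" reads the archive sequentially in blocks (no seeking back).
    with tarfile.open(archive_path, "r|*") as tar:
        for member in tar:
            yield member.name
```

No chdir is needed here, because extractall takes the target directory as its path argument. For archives from untrusted sources, be careful with extractall; Python 3.12 and later accept a filter argument for it that restricts what may be extracted.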

synthesizerpatel answered Dec 28 '22 22:12