
How can I programmatically create a tar archive of nested directories and files solely from Python strings and without temporary files?

I want to create a tar archive with a hierarchical directory structure from Python, using strings for the contents of the files. I've read this question, which shows how to add strings as files but not how to add directories. How can I add directories on the fly to a tar archive without actually creating them on disk?

Something like:

archive.tgz:
    file1.txt
    file2.txt
    dir1/
        file3.txt
        dir2/
            file4.txt
davidscolgan asked Dec 10 '22 03:12


2 Answers

Extending the example from the linked question, you can do it as follows (Python 2):

import tarfile
import StringIO
import time

tar = tarfile.TarFile("test.tar", "w")

string = StringIO.StringIO()
string.write("hello")
string.seek(0)

info = tarfile.TarInfo(name='dir')
info.type = tarfile.DIRTYPE
info.mode = 0755
info.mtime = time.time()
tar.addfile(tarinfo=info)

info = tarfile.TarInfo(name='dir/foo')
info.size=len(string.buf)
info.mtime = time.time()
tar.addfile(tarinfo=info, fileobj=string)

tar.close()

Be careful with the mode attribute: the default value for a TarInfo (0644, i.e. rw-r--r--) does not include execute permission for the owner, and a directory needs that permission before anyone can change into it and get its contents.
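To illustrate the point about the mode, here is a minimal Python 3 sketch (not part of the original answer) that compares the default TarInfo mode with the one stored for a directory entry. It writes to an in-memory BytesIO buffer instead of test.tar so that nothing touches the disk; the variable names are only illustrative.

```python
import io
import tarfile

# Mode a fresh TarInfo carries if you never set one yourself.
default_mode = tarfile.TarInfo(name="dir").mode

# Write a single directory entry into an in-memory archive.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    info = tarfile.TarInfo(name="dir")
    info.type = tarfile.DIRTYPE
    info.mode = 0o755  # rwxr-xr-x, so the owner can enter the directory
    tar.addfile(info)

# Read the archive back and inspect what was stored.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    member = tar.getmember("dir")
    stored_mode = member.mode
    is_dir = member.isdir()

print(oct(default_mode), oct(stored_mode), is_dir)
```

If the info.mode line is left out, the directory entry is stored with the default mode, which is what the warning above is about.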

jcollado answered Jan 13 '23 13:01


A slight modification to the helpful accepted answer so that it works with Python 3 as well as Python 2 (and matches the OP's example a bit more closely):

from io import BytesIO
import tarfile
import time

# create and open empty tar file
tar = tarfile.open("test.tgz", "w:gz")

# Add a file
file1_contents = BytesIO("hello 1".encode())
finfo1 = tarfile.TarInfo(name='file1.txt')
finfo1.size = len(file1_contents.getvalue())
finfo1.mtime = time.time()
tar.addfile(tarinfo=finfo1, fileobj=file1_contents)

# create directory in the tar file
dinfo = tarfile.TarInfo(name='dir')
dinfo.type = tarfile.DIRTYPE
dinfo.mode = 0o755
dinfo.mtime = time.time()
tar.addfile(tarinfo=dinfo)

# add a file to the new directory in the tar file
file2_contents = BytesIO("hello 2".encode())
finfo2 = tarfile.TarInfo(name='dir/file2.txt')
finfo2.size = len(file2_contents.getvalue())
finfo2.mtime = time.time()
tar.addfile(tarinfo=finfo2, fileobj=file2_contents)

tar.close()

In particular, I updated the octal syntax following PEP 3127 (Integer Literal Support and Syntax), switched to BytesIO from io, used getvalue() instead of buf, and used tarfile.open instead of TarFile to produce gzip-compressed output as in the question's example. (Using a context manager (with ... as tar:) would also work in both Python 2 and Python 3, but cut and paste didn't work with my Python 2 REPL, so I didn't switch to it.) Tested on Python 2.7.15+ and Python 3.7.3.
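For reference, the context-manager form mentioned above might look roughly like this in Python 3. This is only a sketch, not the tested code from this answer: the helpers add_dir and add_text are hypothetical names introduced here, the layout mirrors the one in the question, and the archive is written to a BytesIO buffer so that no file is created at all (passing a filename such as "test.tgz" to tarfile.open instead of fileobj would write to disk).

```python
import io
import tarfile
import time


def add_dir(tar, name, mode=0o755):
    """Add a directory entry (hypothetical helper, no directory is created on disk)."""
    info = tarfile.TarInfo(name=name)
    info.type = tarfile.DIRTYPE
    info.mode = mode
    info.mtime = int(time.time())
    tar.addfile(info)


def add_text(tar, name, text):
    """Add a regular file whose contents come from a Python string (hypothetical helper)."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)  # size must match the bytes handed to addfile
    info.mtime = int(time.time())
    tar.addfile(info, io.BytesIO(data))


# Build the gzip-compressed archive entirely in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    add_text(tar, "file1.txt", "hello 1")
    add_text(tar, "file2.txt", "hello 2")
    add_dir(tar, "dir1")
    add_text(tar, "dir1/file3.txt", "hello 3")
    add_dir(tar, "dir1/dir2")
    add_text(tar, "dir1/dir2/file4.txt", "hello 4")

# Read it back to check the structure.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    names = tar.getnames()
    file4_text = tar.extractfile("dir1/dir2/file4.txt").read().decode("utf-8")

print(names)
print(file4_text)
```

The with block closes the archive even if one of the addfile calls raises, which is the main reason to prefer it over an explicit tar.close().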

teichert answered Jan 13 '23 14:01