Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does Python's `tarfile` module store the archives it's building in memory?

I'm working in a memory constrained environment where I need to make archives of SQL dumps. If I use python's built in tarfile module is the '.tar' file held in memory or written to disk as it's created?

For instance, in the following code, if huge_file.sql is 2GB will the tar variable take up 2GB in memory?

import tarfile

tar = tarfile.open("my_archive.tar.gz")), "w|gz")
tar.add('huge_file.sql')
tar.close()
like image 742
Chris W. Avatar asked Mar 10 '11 19:03

Chris W.


1 Answers

No it is not loading it in memory. You can read the source for tarfile to see that it's using copyfileobj, which is using a fixed size buffer to copy from the file to the tarball:

def copyfileobj(src, dst, length=None):
    """Copy length bytes from fileobj src to fileobj dst.
       If length is None, copy the entire content.
    """
    if length == 0:
        return
    if length is None:
        shutil.copyfileobj(src, dst)
        return

    BUFSIZE = 16 * 1024
    blocks, remainder = divmod(length, BUFSIZE)
    for b in xrange(blocks):
        buf = src.read(BUFSIZE)
        if len(buf) < BUFSIZE:
            raise IOError("end of file reached")
        dst.write(buf)

    if remainder != 0:
        buf = src.read(remainder)
        if len(buf) < remainder:
            raise IOError("end of file reached")
        dst.write(buf)
    return
like image 82
Spike Gronim Avatar answered Sep 23 '22 03:09

Spike Gronim