I'm working in a memory-constrained environment where I need to make archives of SQL dumps. If I use Python's built-in tarfile module, is the '.tar' file held in memory or written to disk as it's created?
For instance, in the following code, if huge_file.sql is 2GB, will the tar variable take up 2GB in memory?
import tarfile
tar = tarfile.open("my_archive.tar.gz", "w|gz")
tar.add('huge_file.sql')
tar.close()
No, the file is not loaded into memory. You can read the source for tarfile to see that it uses copyfileobj, which copies from the file to the tarball through a fixed-size buffer (16 KiB in the version quoted here):
def copyfileobj(src, dst, length=None):
    """Copy length bytes from fileobj src to fileobj dst.
       If length is None, copy the entire content.
    """
    if length == 0:
        return
    if length is None:
        shutil.copyfileobj(src, dst)
        return
    BUFSIZE = 16 * 1024
    blocks, remainder = divmod(length, BUFSIZE)
    for b in xrange(blocks):
        buf = src.read(BUFSIZE)
        if len(buf) < BUFSIZE:
            raise IOError("end of file reached")
        dst.write(buf)
    if remainder != 0:
        buf = src.read(remainder)
        if len(buf) < remainder:
            raise IOError("end of file reached")
        dst.write(buf)
    return
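If you would rather check this on your own machine than take the source on trust, here is a minimal Python 3 sketch that does so. It writes a stand-in dump file, archives it with the same "w|gz" mode as in the question, and uses tracemalloc to report the peak memory allocated during archiving. The names archive_and_measure, demo and dump.sql are made up for illustration, and tracemalloc only sees allocations made through Python's allocators, so treat the number as a rough indication rather than the full process footprint.

```python
import os
import tarfile
import tempfile
import tracemalloc

MIB = 1024 * 1024


def archive_and_measure(src_path, archive_path):
    """Archive src_path as a gzip-compressed tar stream and return the
    peak number of bytes tracemalloc saw allocated while doing so."""
    tracemalloc.start()
    try:
        with tarfile.open(archive_path, "w|gz") as tar:
            tar.add(src_path, arcname=os.path.basename(src_path))
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def demo(size_mb=16):
    """Build a hypothetical dump of size_mb MiB, archive it, and return
    (peak_bytes, source_size, size_recorded_in_archive)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "dump.sql")
        archive = os.path.join(tmp, "my_archive.tar.gz")

        # Write the stand-in dump 1 MiB at a time so that creating it
        # does not itself need much memory.
        line = b"INSERT INTO t VALUES (1);\n"
        chunk = (line * (MIB // len(line) + 1))[:MIB]
        with open(src, "wb") as f:
            for _ in range(size_mb):
                f.write(chunk)

        peak = archive_and_measure(src, archive)

        # Read the archive back to confirm the whole file was stored.
        with tarfile.open(archive, "r:gz") as tar:
            member_size = tar.getmember("dump.sql").size

        return peak, os.path.getsize(src), member_size


if __name__ == "__main__":
    peak, size, member_size = demo()
    print("source size: %d bytes" % size)
    print("stored size: %d bytes" % member_size)
    print("peak traced memory: %d bytes" % peak)
```

If the copy is buffered as described, the peak should stay far below the size of the source file, and it should not grow when you raise size_mb.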