My question is a follow-up to this one. I would like to know how to modify the following code so that I can set a compression level:
import os
import tarfile
home = '//global//scratch//chamar//parsed_data//batch0'
backup_dir = '//global//scratch//chamar//parsed_data//'
home_dirs = [ name for name in os.listdir(home) if os.path.isdir(os.path.join(home, name)) ]
for directory in home_dirs:
    full_dir = os.path.join(home, directory)
    tar = tarfile.open(os.path.join(backup_dir, directory+'.tar.gz'), 'w:gz')
    tar.add(full_dir, arcname=directory)
    tar.close()
Basically, the code loops through each directory in batch0 (each one contains 6000+ files) and creates one tar.gz archive per directory in //global//scratch//chamar//parsed_data//.
I think the default compression level is 9, but it takes a long time to compress. I don't need much compression; level 5 would be enough. How can I modify the above code to set a compression level?
There is a compresslevel keyword argument you can pass to tarfile.open() (no need to use gzopen() directly):
tar = tarfile.open(filename, "w:gz", compresslevel=5)
From the gzip documentation, compresslevel can be an integer from 0 to 9 (9 is the default): 0 means no compression, 1 is the fastest and least compressed, and 9 is the slowest and most compressed.
[See also: tarfile documentation]
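Applied to the loop from the question, that could look roughly like the sketch below. The function name archive_subdirs and its parameters are my own, not part of tarfile; the only real change from the original code is the compresslevel=5 argument (plus a with block so each archive is closed even if add() raises).

```python
import os
import tarfile


def archive_subdirs(home, backup_dir, compresslevel=5):
    """Create one <name>.tar.gz in backup_dir for each subdirectory of home.

    archive_subdirs is a hypothetical helper name used for illustration.
    """
    created = []
    for name in sorted(os.listdir(home)):
        full_dir = os.path.join(home, name)
        if not os.path.isdir(full_dir):
            continue  # skip plain files sitting next to the directories
        target = os.path.join(backup_dir, name + '.tar.gz')
        # compresslevel is passed through to the gzip layer:
        # lower values are faster but produce larger archives.
        with tarfile.open(target, 'w:gz', compresslevel=compresslevel) as tar:
            tar.add(full_dir, arcname=name)
        created.append(target)
    return created
```

You would call it as archive_subdirs(home, backup_dir) with the two paths from the question. How much time level 5 saves compared with level 9 depends on your data, so it is worth timing one directory before running the whole batch.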