Question

My question is a follow-up to this one. I would like to know how to modify the following code so that I can set a compression level:

import os
import tarfile

home = '//global//scratch//chamar//parsed_data//batch0'
backup_dir = '//global//scratch//chamar//parsed_data//'

home_dirs = [ name for name in os.listdir(home) if os.path.isdir(os.path.join(home, name)) ]

for directory in home_dirs:
    full_dir = os.path.join(home, directory)
    tar = tarfile.open(os.path.join(backup_dir, directory+'.tar.gz'), 'w:gz')
    tar.add(full_dir, arcname=directory)
    tar.close()

Basically, the code loops through each directory in batch0 (each directory contains 6000+ files) and creates a tar.gz archive for each one in //global//scratch//chamar//parsed_data//. I think the default compression level is 9, but compressing at that level takes a long time. I don't need much compression; level 5 would be enough. How can I modify the code above to set a compression level?

Solution

The gzopen method accepts a compresslevel option. Replace the line containing the tarfile.open call in your example with the line below:

tar = tarfile.TarFile.gzopen(os.path.join(backup_dir, directory+'.tar.gz'), mode='w', compresslevel=5)

Other tips

You can pass a compresslevel keyword argument to open(), so there is no need to call gzopen() directly:

tar = tarfile.open(filename, "w:gz", compresslevel=5)

According to the gzip documentation, compresslevel is an integer from 1 to 9, where 1 is the fastest and compresses the least, 9 is the slowest and compresses the most, and 9 is the default. A value of 0 is also accepted and means no compression.
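Putting this together with the loop from the question, a minimal sketch could look like the following. The function name archive_subdirs and its parameters are hypothetical, chosen only for illustration; the tarfile.open call with compresslevel is the one shown above.

```python
import os
import tarfile


def archive_subdirs(home, backup_dir, compresslevel=5):
    """Create one <name>.tar.gz in backup_dir for each subdirectory of home.

    archive_subdirs is a hypothetical helper, not part of tarfile.
    """
    created = []
    for name in sorted(os.listdir(home)):
        full_dir = os.path.join(home, name)
        if not os.path.isdir(full_dir):
            continue  # skip plain files, as the original list comprehension does
        target = os.path.join(backup_dir, name + '.tar.gz')
        # In 'w:gz' mode, compresslevel is passed on to the gzip layer.
        # The with-block closes the archive even if tar.add() raises.
        with tarfile.open(target, 'w:gz', compresslevel=compresslevel) as tar:
            tar.add(full_dir, arcname=name)
        created.append(target)
    return created
```

With the paths from the question, this would be called as archive_subdirs(home, backup_dir, compresslevel=5). Lower levels trade archive size for speed, so it may be worth timing a couple of values on one directory before running the whole batch.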

[See also: tarfile documentation]

Licensed under: CC-BY-SA with attribution