Question

I have a .tgz file that contains multiple text files. I can read it in Python using the gzip module, and I see that the first line contains some information about the subsequent file, but it is unclear how I can properly iterate through the files in Python. I would like to be able to do something like:

for file in tgzFile:
  read file
  do stuff for file

I can read each line of the gzipped file, and I could attempt to identify the start of a file from the contents of the line, but I would prefer a cleaner method. Thanks.


Solution

import tarfile

# Mode defaults to "r", which detects the gzip compression automatically.
with tarfile.open("file.tgz") as tar:
    for member in tar.getmembers():
        print(member.name)

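A .tgz file is a tar archive compressed with gzip, so the gzip module only undoes the compression and leaves the raw tar headers in the stream; the tarfile module handles both layers. tar.getmembers() returns a list of TarInfo objects, one per entry in the archive. Each has attributes such as name and size, and can be passed to tar.extractfile() to read that member's contents. http://docs.python.org/2/library/tarfile.html#tarfile.TarInfo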
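
To actually read each file rather than just list its name, something along these lines should work. It is only a sketch: the helper name iter_text_members and the UTF-8 default are my own assumptions, not part of tarfile, so adjust the encoding to whatever your text files use.

```python
import tarfile


def iter_text_members(archive_path, encoding="utf-8"):
    """Yield (name, text) for each regular file in a tar archive.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    # Mode "r" lets tarfile detect gzip compression on its own.
    with tarfile.open(archive_path, "r") as tar:
        # Iterating a TarFile yields one TarInfo per archive member.
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks and other non-files
            # extractfile() returns a binary file-like object for the member.
            with tar.extractfile(member) as handle:
                yield member.name, handle.read().decode(encoding)


# Example usage, mirroring the loop in the question:
# for name, text in iter_text_members("file.tgz"):
#     for line in text.splitlines():
#         ...  # do stuff for this file
```

This reads each member fully into memory, which is fine for small text files. For large members you could iterate over the object returned by extractfile() line by line instead.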

Licensed under: CC-BY-SA with attribution