Question

I am using Python's tarfile module to extract files from a *.tgz file. Here is what I use:

import tarfile
tar = tarfile.open("some.tar")
tar.extractall(".")
tar.close()

Assume the contents of "some.tar" are:

-a.txt ===> user:usr1 , group: grp1
-b.txt ===> user:usr2 , group: grp2

But after extracting I lose all of the user, group, date... information. The files now belong to whoever calls the script (in my case root). They become:

-a.txt ===> user:root , group: root
-b.txt ===> user:root , group: root

Is there a way to keep the owner and date information of the files?

From tarfile module page:

-handles directories, regular files, hardlinks, symbolic links, fifos, character devices and block devices and is able to acquire and restore file information like timestamp, access permissions and owner.

From this statement I understand that it is very well possible to do this with the "tarfile" module, or do I understand it wrong?

Python version is 2.6.1

Edit: I am running this script as root

Thanks


Solution

As guettli says, you have to be root to be able to change the ownership of a file to somebody else. Allowing anyone else to do so would open a huge security hole. This is true both when using the tar(1) program and when using the tarfile package from Python.

Note, though, that some earlier versions of Python have a bug (see the issue in the comments below) that means files extracted by root are owned by root instead of the real owner (user and group).

OTHER TIPS

First, your script needs to run as root (on Unix-like systems). Otherwise, you can't use chown.

You need to get the TarInfo object for the files:

http://docs.python.org/library/tarfile.html#tarfile.TarInfo

There you get the uid (user ID) and gid (group ID), and/or the user name (uname).

Then you need to use chown.
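
The steps above could be sketched roughly as follows. This is only a minimal illustration, not the tarfile module's own mechanism: the function name extract_with_owners and its chown parameter are hypothetical, and the parameter exists only so the sketch can be tried without root. It uses the numeric uid and gid stored in each TarInfo, and it assumes the archive is trusted and that the real call runs as root on a Unix-like system.

```python
import os
import tarfile


def extract_with_owners(archive_path, dest=".", chown=None):
    """Extract every member, then apply the uid/gid stored in the archive.

    `chown` is a hypothetical hook (defaults to os.chown) so the sketch
    can be exercised without root. Returns the (path, uid, gid) tuples
    that were passed to it.
    """
    if chown is None:
        chown = os.chown  # needs root to give files to another user
    applied = []
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Write the member to disk first (assumes a trusted archive).
            tar.extract(member, dest)
            target = os.path.join(dest, member.name)
            # Restore the numeric owner and group recorded in the TarInfo.
            chown(target, member.uid, member.gid)
            applied.append((target, member.uid, member.gid))
    return applied
```

Run as root, extract_with_owners("some.tar", ".") would extract the files and then chown each one to the IDs recorded in the archive. Note that os.chown follows symbolic links, so an archive containing symlinks would need os.lchown instead.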

Licensed under: CC-BY-SA with attribution