python/zip: How to eliminate absolute path in zip archive if absolute paths for files are provided?

StackOverflow https://stackoverflow.com/questions/16091904

Question

I have two files in two different directories, one is '/home/test/first/first.pdf', the other is '/home/text/second/second.pdf'. I use following code to compress them:

import zipfile, StringIO
buffer = StringIO.StringIO()
first_path = '/home/test/first/first.pdf'
second_path = '/home/text/second/second.pdf'
zip = zipfile.ZipFile(buffer, 'w')
zip.write(first_path)
zip.write(second_path)
zip.close()

After I open the zip file that I created, I have a home folder in it, then there are two sub-folders in it, first and second, then the pdf files. I don't know how to include only two pdf files instead of having full path zipped into the zip archive. I hope I make my question clear, please help. Thanks.

Was it helpful?

Solution

The zipfile write() method supports an extra argument (arcname) which is the archive name to be stored in the zip file, so you would only need to change your code with:

from os.path import basename
...
zip.write(first_path, basename(first_path))
zip.write(second_path, basename(second_path))
zip.close()

When you have some spare time reading the documentation for zipfile will be helpful.

OTHER TIPS

I use this function to zip a directory without include absolute path

import zipfile
import os 
def zipDir(dirPath, zipPath):
    zipf = zipfile.ZipFile(zipPath , mode='w')
    lenDirPath = len(dirPath)
    for root, _ , files in os.walk(dirPath):
        for file in files:
            filePath = os.path.join(root, file)
            zipf.write(filePath , filePath[lenDirPath :] )
    zipf.close()
#end zipDir

I suspect there might be a more elegant solution, but this one should work:

def add_zip_flat(zip, filename):
    dir, base_filename = os.path.split(filename)
    os.chdir(dir)
    zip.write(base_filename)

zip = zipfile.ZipFile(buffer, 'w')
add_zip_flat(zip, first_path)
add_zip_flat(zip, second_path)
zip.close()

You can override the filename in the archive with the arcname parameter:

with zipfile.ZipFile(file="sample.zip", mode="w", compression=zipfile.ZIP_DEFLATED) as out_zip:
for f in Path.home().glob("**/*.txt"):
    out_zip.write(f, arcname=f.name)

Documentation reference: https://docs.python.org/3/library/zipfile.html#zipfile.ZipFile.write

Can be done that way also (this allow for creating archives >2GB)

import os, zipfile
def zipdir(path, ziph):
    """zipper"""
    for root, _, files in os.walk(path):
        for file_found in files:
            abs_path = root+'/'+file_found
            ziph.write(abs_path, file_found)
zipf = zipfile.ZipFile(DEST_FILE.zip, 'w', zipfile.ZIP_DEFLATED, allowZip64=True)
zipdir(SOURCE_DIR, zipf)
zipf.close()

As João Pinto said, the arcname argument of ZipFile.write is what you need. Also, reading the documentation of pathlib is helpful. You can easily get the relative path to something also with pathlib.Path.relative_to, no need to switch to os.path.

import zipfile
from pathlib import Path

folder_to_compress = Path("/path/to/folder")
path_to_archive = Path("/path/to/archive.zip")

with zipfile.ZipFile(
        path_to_archive,
        mode="w",
        compression=zipfile.ZIP_DEFLATED,
        compresslevel=7,
    ) as zip:
    for file in folder_to_compress.rglob("*"):
        relative_path = file.relative_to(folder_to_compress)
        print(f"Packing {file} as {relative_path}")
        zip.write(file, arcname=relative_path)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top