Question

I have a directory of zip archives containing .jpg, .png, and .gif images. I want to unzip each archive, taking only the images and putting them in a folder named after the archive.

So:

files/archive1.zip
files/archive2.zip
files/archive3.zip
files/archive4.zip

Open archive1.zip and take sunflower.jpg and rose_sun.gif. Make a folder files/archive1/ and add the images to it, giving files/archive1/sunflower.jpg and files/archive1/rose_sun.gif. Do this for each archive.

I really don't know how this can be done, so all suggestions are welcome. I have over 600 archives, so an automatic solution would be a lifesaver, preferably one that runs on Linux.


Solution

Something along the lines of:

#!/bin/bash
cd ~/basedir/files || exit 1
for file in *.zip ; do
    # Folder name is the archive name without its .zip extension
    newfile="${file%.zip}"
    echo ":${newfile}:"
    # Unpack a copy in a scratch directory, then keep only the images
    rm -rf tmp "${newfile}"
    mkdir tmp "${newfile}"
    cp "${file}" tmp
    cd tmp
    unzip "${file}"
    find . -name '*.jpg' -exec cp {} "../${newfile}" ';'
    find . -name '*.png' -exec cp {} "../${newfile}" ';'
    find . -name '*.gif' -exec cp {} "../${newfile}" ';'
    cd ..
    rm -rf tmp
done

This is tested and will handle spaces in filenames (both the zip files and the extracted files). You may have collisions if the zip file has the same file name in different directories (you can't avoid this if you're going to flatten the directory structure).
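If the collisions mentioned above matter, one option is to rename on copy rather than overwrite. A minimal sketch, assuming a hypothetical helper copy_unique (not part of the original script) that appends _1, _2, ... before the extension when the name is already taken:

```shell
#!/bin/bash
# copy_unique SRC DESTDIR  (hypothetical helper, for illustration only)
# Copies SRC into DESTDIR. If a file with that name already exists there,
# a numeric suffix is added before the extension instead of overwriting.
copy_unique() {
    local src destdir name stem ext target n
    src=$1
    destdir=$2
    name=${src##*/}                 # strip leading directories
    case $name in
        *.*) stem=${name%.*}; ext=.${name##*.} ;;
        *)   stem=$name;      ext= ;;
    esac
    target="$destdir/$name"
    n=1
    while [ -e "$target" ]; do
        target="$destdir/${stem}_$n$ext"
        n=$((n + 1))
    done
    cp -- "$src" "$target"
}
```

It could stand in for the plain cp calls, for example by looping over the find results and calling copy_unique "$f" "../${newfile}" for each one.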

OTHER TIPS

In Short

You can do this with a one-liner find + unzip.

find . -name "*.zip" -type f -exec unzip -jd "images/{}" "{}" "*.jpg" "*.png" "*.gif" \;

In Detail

unzip allows you to specify the files you want:

unzip archive.zip "*.jpg" "*.png" "*.gif"

And -d a target directory:

unzip -d images/ archive.zip "*.jpg" "*.png" "*.gif"

Combine that with a find, and you can extract all the images in all zips:

find . -name "*.zip" -type f -exec unzip -d images/ {} "*.jpg" "*.png" "*.gif" \;

Adding unzip -j junks the zip's internal directory structure on extraction, so everything can be done in one command. This gives the flat list of images, separated by zip name, that you asked for:

find . -name "*.zip" -type f -exec unzip -jd "images/{}" "{}" "*.jpg" "*.png" "*.gif" \;

A limitation is that unzip -d won't create more than one new level of directories, so just mkdir images first. Enjoy.
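With the one-liner, each folder ends up named after the full file name (images/archive1.zip/). If the folders should be named archive1 instead, a loop gives a bit more control than find -exec. This is only a sketch: archive_dir_name and extract_images are hypothetical helper names, the archives are assumed to sit directly in ./files, and -C (case-insensitive pattern matching, so .JPG is picked up too) and -n (never overwrite) are additions that are not in the command above.

```shell
#!/bin/bash
# archive_dir_name ZIP: print the ZIP path with a trailing .zip removed
archive_dir_name() {
    printf '%s\n' "${1%.zip}"
}

# extract_images ZIP: flat extraction of the images into a folder named after ZIP
extract_images() {
    local dest
    dest=$(archive_dir_name "$1")
    mkdir -p "$dest" || return
    # -q quiet, -j junk paths, -C case-insensitive patterns, -n never overwrite
    # unzip exits non-zero (status 11) when no entry matches the patterns
    unzip -q -j -C -n "$1" '*.jpg' '*.png' '*.gif' -d "$dest"
}

for zip in files/*.zip; do
    [ -e "$zip" ] || continue       # skip the literal pattern when nothing matches
    extract_images "$zip" || echo "nothing extracted from $zip" >&2
done
```

An archive without any images still gets an (empty) folder here; an rmdir on failure would tidy that up if it matters.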

7-Zip can do this, and it has a Linux version.

mkdir -p files/archive1
7z e -ofiles/archive1/ files/archive1.zip "*.jpg" "*.png" "*.gif"

(Just tested it, it works.)
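One thing to watch: the wildcard patterns are meant for 7z (or unzip), not for the shell, so it is safer to quote them. If the current directory happens to contain a matching file, the shell expands an unquoted *.jpg before the program ever sees it. A small sketch of the difference, using a throwaway directory and a made-up file name cover.jpg:

```shell
#!/bin/bash
# Throwaway directory containing a single file that matches *.jpg
demo_dir=$(mktemp -d)
touch "$demo_dir/cover.jpg"

# Unquoted: the shell replaces the pattern with the local file name
unquoted=$(cd "$demo_dir" && printf '%s' *.jpg)

# Quoted: the pattern is passed through unchanged
quoted=$(cd "$demo_dir" && printf '%s' '*.jpg')

echo "unquoted argument: $unquoted"
echo "quoted argument:   $quoted"

rm -rf "$demo_dir"
```

In the unquoted case the extractor would only be asked for cover.jpg, so every other image in the archive would be skipped.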

You can write a program using a zip library. If you use Mono, you can use DotNetZip.

The code would look like this:

foreach (var archive in listOfZips)
{
    using (var zip = ZipFile.Read(archive))
    {
        foreach (ZipEntry e in zip)
        {
            if (IsImageFile(e.FileName))
            {
                e.FileName = System.IO.Path.Combine(archive.Replace(".zip",""), 
                                  System.IO.Path.GetFileName(e.FileName));
                e.Extract("files");
            }
        }
    }
}

Perl's Archive::Zip module is a good library for zipping and unzipping.

Here's my take on the first answer...

#!/bin/bash
cd files || exit 1
for zip_name in *.zip ; do
    # Strip the .zip extension to get the folder name
    dir_name="${zip_name%.zip}"
    mkdir -p "${dir_name}"
    # Quote the patterns so the shell passes them to 7z unexpanded
    7z e -o"${dir_name}/" "${zip_name}" "*.jpg" "*.png" "*.gif"
done

or, if you'd just like to use the regular unzip command...

unzip -d "${dir_name}/" "${zip_name}" "*.jpg" "*.png" "*.gif"

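I haven't tested this, but it should work... or something along these lines. It should also be more efficient than the first solution, since it extracts only the images instead of unpacking everything into a temporary directory. :)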

Hope this helps!

Licensed under: CC-BY-SA with attribution