Question

I am using the following standalone class to calculate the size of a ZIP file before zipping. I am using compression level 0, but I still get a difference of a few bytes. Can you please help me get the exact size?

Quick help will be appreciated.

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileNotFoundException;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.zip.CRC32;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipOutputStream;

    public class zipcode {

        public static void main(String[] args) {
            try {
                CRC32 crc = new CRC32();
                byte[] b = new byte[1024]; // buffer size
                File file = new File("/Users/Lab/Desktop/ABC.xlsx");
                FileInputStream in = new FileInputStream(file);
                crc.reset();

                // output file
                ZipOutputStream out = new ZipOutputStream(new FileOutputStream("/Users/Lab/Desktop/ABC.zip"));

                // name the file inside the zip file
                ZipEntry entry = new ZipEntry("ABC.xlsx");
                entry.setMethod(ZipEntry.DEFLATED);
                entry.setCompressedSize(file.length());
                entry.setSize(file.length());
                entry.setCrc(crc.getValue());
                out.setMethod(ZipOutputStream.DEFLATED);
                out.setLevel(0);
                //entry.setCompressedSize(in.available());
                //entry.setSize(in.available());
                //entry.setCrc(crc.getValue());

                out.putNextEntry(entry);

                int count;
                while ((count = in.read(b)) > 0) {
                    System.out.println();
                    out.write(b, 0, count);
                }
                out.close();
                in.close();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

Solution

Firstly, I'm not convinced by your explanation of why you need to do this. If it is necessary to know the file size before you start uploading, there is something wrong with your system design or implementation.

Having said that, the solution is basically to create the ZIP file on the server side so that you know its size before you start uploading it to the client:

  • Write the ZIP file to a temporary file and upload from that.

  • Write the ZIP file to a buffer in memory and upload from that.

If you don't have either the file space or the memory space on the server side, then:

  • Create a "sink" OutputStream that simply counts the bytes written to it, so that you can calculate the nominal file size.

  • Create / write the ZIP file to the sink, and capture the file size.

  • Open your connection for uploading.

  • Send the metadata including the file size.

  • Create / write the ZIP a second time, writing to the socket stream ... or whatever.

These 3 approaches will all allow you to create and send a compressed ZIP, if that is going to help.
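
The two-pass variant could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the names `TwoPassZip`, `CountingSink`, `writeZip` and `measure` are hypothetical, the file contents are held in memory for brevity, and a `ByteArrayOutputStream` stands in for the real socket stream. It assumes both passes see exactly the same entries and data, so that they produce the same number of bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class TwoPassZip {

    /** Sink that discards everything written to it and only counts bytes. */
    static final class CountingSink extends OutputStream {
        private long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] buf, int off, int len) {
            count += len; // bulk writes are counted without touching the data
        }

        long getCount() {
            return count;
        }
    }

    /** Writes every entry to the given stream; both passes call this method. */
    static void writeZip(Map<String, byte[]> files, OutputStream target) throws IOException {
        ZipOutputStream zip = new ZipOutputStream(target);
        for (Map.Entry<String, byte[]> file : files.entrySet()) {
            zip.putNextEntry(new ZipEntry(file.getKey()));
            zip.write(file.getValue());
            zip.closeEntry();
        }
        zip.finish(); // completes the archive but leaves the target stream open
    }

    /** Pass 1: run the compression against the sink to learn the size. */
    static long measure(Map<String, byte[]> files) throws IOException {
        CountingSink sink = new CountingSink();
        writeZip(files, sink);
        return sink.getCount();
    }

    public static void main(String[] args) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        files.put("a.txt", "hello hello hello".getBytes(StandardCharsets.UTF_8));

        long size = measure(files); // pass 1: count only
        System.out.println("Size to announce: " + size);

        ByteArrayOutputStream upload = new ByteArrayOutputStream(); // stand-in for the socket
        writeZip(files, upload); // pass 2: the real upload
        System.out.println("Bytes sent: " + upload.size());
    }
}
```

The cost is that every file is read and compressed twice, which is the trade-off for not needing temporary disk space or memory for the whole archive.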


If you insist on trying to do this on-the-fly in one pass, then you are going to need to read the ZIP file spec in forensic detail ... and do some messy arithmetic. Helping you is probably beyond the scope of a SO question.

Other tips

I had to do this myself to write the zip results straight to AWS S3, which requires the file size up front. Unfortunately, I found no way to compute the size of a compressed file without actually performing the compression on each block of data.

One method is to zip everything twice. The first time you throw out the data but add up the number of bytes:

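    // Needs java.io.IOException, java.io.InputStream, java.io.OutputStream,
    // java.util.List, java.util.concurrent.atomic.AtomicLong
    // and java.util.zip.ZipOutputStream.
    long getSize(List<InputStream> files) throws IOException {
        final AtomicLong counter = new AtomicLong(0L);
        final OutputStream countingStream = new OutputStream() {
            @Override
            public void write(int b) throws IOException {
                counter.incrementAndGet();
            }

            // Count bulk writes in one step instead of byte by byte.
            @Override
            public void write(byte[] b, int off, int len) throws IOException {
                counter.addAndGet(len);
            }
        };
        ZipOutputStream zoutcounter = new ZipOutputStream(countingStream);
        // Loop through files or input streams here and do compression
        // ...
        zoutcounter.close();

        return counter.get();
    }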

The alternative is to do the above, creating an entry for each file but not writing any actual data (don't call write()), so that you compute the total size of just the zip entry headers. This only works if you turn off compression like this:

    entry.setMethod(ZipEntry.STORED);

The size of the zip entry headers plus the size of each uncompressed file should give you an accurate final size, but only with compression turned off. You don't need the real CRC or size values when computing the zip file size, because those fields are fixed-width and take up the same space in the final zip file whatever they contain (provided nothing is large enough to need ZIP64 records). Note that java.util.zip.ZipOutputStream rejects a STORED entry whose size or CRC is unset, so for the header-only pass set placeholders: a size and CRC of 0 match an entry with no data. It's only the name, comment and extra fields on the ZipEntry that vary in size.
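
As a rough sketch of that header-only pass, assuming entries without comments or extra fields and an archive small enough not to need ZIP64: the class name `StoredZipSize` and the method `predict` are hypothetical, and the map of entry names to file lengths is just one convenient way to pass the inputs.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class StoredZipSize {

    /**
     * Predicts the size of an uncompressed (STORED) archive from entry names
     * and file lengths alone, without reading any file data.
     */
    static long predict(Map<String, Long> nameToLength) throws IOException {
        final long[] headerBytes = {0};
        OutputStream sink = new OutputStream() {
            @Override
            public void write(int b) {
                headerBytes[0]++;
            }

            @Override
            public void write(byte[] buf, int off, int len) {
                headerBytes[0] += len;
            }
        };

        long dataBytes = 0;
        ZipOutputStream zip = new ZipOutputStream(sink);
        for (Map.Entry<String, Long> file : nameToLength.entrySet()) {
            ZipEntry entry = new ZipEntry(file.getKey());
            entry.setMethod(ZipEntry.STORED);
            // Placeholders: an empty entry has size 0 and CRC 0. The fields are
            // fixed-width, so the real values would occupy the same space.
            entry.setSize(0);
            entry.setCompressedSize(0);
            entry.setCrc(0);
            zip.putNextEntry(entry);
            zip.closeEntry(); // no write(): only headers reach the sink
            dataBytes += file.getValue();
        }
        zip.close(); // emits the central directory and the end record
        return headerBytes[0] + dataBytes;
    }
}
```

Under the same assumptions the arithmetic can also be done by hand from the ZIP format: each entry costs 30 bytes plus the name for its local header and 46 bytes plus the name for its central directory record, and the archive ends with a 22-byte record. For a single STORED entry named `ABC.xlsx` that comes to 114 bytes on top of the file length.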

There is one more solution you can try. Guess the size conservatively and add a safety margin, then compress aggressively. Pad the rest of the file until it equals your estimated size. ZIP readers generally ignore trailing padding, though it is worth checking against the tools that will consume the file. If you implement an output stream that wraps your actual output stream but implements the close operation as a no-op, then you can pass that as the output stream for your ZipOutputStream. After you close your ZipOutputStream instance, write the padding to the actual output stream to reach your estimated number of bytes, then close it for real. The file will be larger than it could be, but you save the computation of the accurate file size and the result will benefit from at least some compression.
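
A minimal sketch of the padding idea, under stated assumptions: the names `PaddedZip`, `NonClosingCounter` and `writePadded` are hypothetical, the padding consists of zero bytes, and `declaredSize` is whatever conservative estimate was announced up front. If the archive turns out larger than the estimate, the data has already been sent, so the sketch can only report the failure.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Map;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class PaddedZip {

    /** Forwards and counts bytes, but turns close() into a plain flush. */
    static final class NonClosingCounter extends FilterOutputStream {
        long written;

        NonClosingCounter(OutputStream out) {
            super(out);
        }

        @Override
        public void write(int b) throws IOException {
            out.write(b);
            written++;
        }

        @Override
        public void write(byte[] buf, int off, int len) throws IOException {
            out.write(buf, off, len);
            written += len;
        }

        @Override
        public void close() throws IOException {
            flush(); // deliberately leave the real stream open for the padding
        }
    }

    /** Writes the archive, then pads the real stream up to declaredSize bytes. */
    static void writePadded(Map<String, byte[]> files, OutputStream real, long declaredSize)
            throws IOException {
        NonClosingCounter shield = new NonClosingCounter(real);
        ZipOutputStream zip = new ZipOutputStream(shield);
        zip.setLevel(Deflater.BEST_COMPRESSION);
        for (Map.Entry<String, byte[]> file : files.entrySet()) {
            zip.putNextEntry(new ZipEntry(file.getKey()));
            zip.write(file.getValue());
            zip.closeEntry();
        }
        zip.close(); // finishes the archive; the shield keeps 'real' open

        long padding = declaredSize - shield.written;
        if (padding < 0) {
            throw new IOException("Archive exceeded the declared size by " + (-padding) + " bytes");
        }
        byte[] zeros = new byte[8192];
        while (padding > 0) {
            int chunk = (int) Math.min(zeros.length, padding);
            real.write(zeros, 0, chunk);
            padding -= chunk;
        }
        real.close(); // now close it for real
    }
}
```

A sequential reader such as `java.util.zip.ZipInputStream` stops at the central directory and never looks at the padding. Readers that locate the end record by scanning backwards from the end of the file may only tolerate a limited amount of trailing data, so it seems prudent to keep the safety margin modest.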

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow