Question

I'm using java.util.zip.

I have a while loop that reads the buffer until it is empty. I'm unzipping two or more files from a folder, and I want it to be faster, so I'd like to use threads. But if I use one thread per file, I won't see much of a difference when a 1 GB file is unzipped alongside smaller ones.

How can I share that job between threads? I can't read the stream from different threads (or can I?).

Solution

Today's average computer handles zip decompression much faster than a hard disk can deliver the data. This applies to most SSDs as well, since there the bus is the limiting factor.

So any attempt to speed up the process by changing the CPU utilization will fail. The best thing you can do is to separate reading and writing, which might yield a gain if the source and target are on different devices.
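
As a rough illustration of separating reading from writing, here is a minimal sketch, not a tuned implementation. The class name PipelinedCopy, the 64 KB chunk size and the queue depth of 16 are my own assumptions. The calling thread reads (and so decompresses) chunks, and a helper thread writes them out, so a slow target device does not stall the reads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical helper: reads on the calling thread, writes on a second thread.
final class PipelinedCopy {
    // Sentinel that marks the end of the data; compared by identity.
    private static final byte[] EOF = new byte[0];

    static long copy(InputStream in, OutputStream out)
            throws IOException, InterruptedException {
        // Bounded queue so the reader cannot run arbitrarily far ahead.
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(16);
        IOException[] writeError = new IOException[1];

        Thread writer = new Thread(() -> {
            try {
                for (byte[] chunk = queue.take(); chunk != EOF; chunk = queue.take()) {
                    // After a failed write, keep draining so the reader never blocks.
                    if (writeError[0] != null) continue;
                    try {
                        out.write(chunk);
                    } catch (IOException e) {
                        writeError[0] = e;
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        long total = 0;
        byte[] buf = new byte[64 * 1024];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {
                if (n > 0) {
                    queue.put(Arrays.copyOf(buf, n)); // hand a private copy to the writer
                    total += n;
                }
            }
        } finally {
            queue.put(EOF); // always let the writer finish
            writer.join();
        }
        if (writeError[0] != null) throw writeError[0];
        return total;
    }
}
```

You would pass it the stream returned by ZipFile.getInputStream(entry) (or a ZipInputStream positioned at an entry) and the target file's OutputStream. Whether this helps at all depends on the devices involved, so measure before keeping it.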

Alternatively, make the processing that follows decompression multi-threaded. But if you are just reading the data and discarding it, there is no way to accelerate the process significantly.

OTHER TIPS

If you want multiple threads to read from the same zip file in a scalable way, open one ZipFile instance per thread. That way, the lock that ZipFile holds on its own instance while reading does not block all but one thread at any given time. It also means that each thread closes its own instance when it is done, not a shared one, so no thread is left reading from a ZipFile that another thread has already closed.
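
A minimal sketch of the one-ZipFile-per-thread idea, assuming you already know the entry names you want. The class PerThreadZipReader and its readAll method are hypothetical names, and it only counts the decompressed bytes where real code would process them.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: each worker opens and closes its own ZipFile.
final class PerThreadZipReader {
    /** Returns the decompressed size of each named entry. */
    static Map<String, Long> readAll(Path zipPath, List<String> names, int threads)
            throws Exception {
        Map<String, Long> sizes = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int worker = t;
                Callable<Void> task = () -> {
                    // Private instance: no lock shared with the other workers.
                    try (ZipFile zf = new ZipFile(zipPath.toFile())) {
                        byte[] buf = new byte[64 * 1024];
                        // Static split: worker w takes entries w, w + threads, ...
                        for (int i = worker; i < names.size(); i += threads) {
                            ZipEntry entry = zf.getEntry(names.get(i));
                            if (entry == null) {
                                throw new IOException("missing entry: " + names.get(i));
                            }
                            long total = 0;
                            try (InputStream in = zf.getInputStream(entry)) {
                                for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                                    total += n; // real code would process the bytes here
                                }
                            }
                            sizes.put(entry.getName(), total);
                        }
                    }
                    return null;
                };
                results.add(pool.submit(task));
            }
            for (Future<Void> r : results) {
                r.get(); // rethrows a worker failure as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return sizes;
    }
}
```

The static split is the simplest thing that works, but note that one very large entry still keeps a single worker busy while the others finish early.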

Protip: if you really care about speed, you can get more performance by reading all the ZipEntry objects once from the first ZipFile instance and sharing them with all threads, so that each thread does not repeat the work of reading them. A ZipEntry object is not tied to a specific ZipFile instance per se. It just records metadata that will work with any ZipFile object representing the same zip file that the ZipEntry came from.
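
Here is a sketch of that tip, under the assumption (true of the OpenJDK versions I have looked at, but not something the Javadoc spells out) that ZipFile.getInputStream looks the entry up by name, so an entry from one instance works with another. SharedEntryReader is a hypothetical name. The entries go into a shared queue, so a thread that gets stuck with a huge entry does not hold up the small ones.

```java
import java.io.InputStream;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: entries are listed once, then shared by all workers.
final class SharedEntryReader {
    /** Returns the decompressed size of every file entry in the zip. */
    static Map<String, Long> readAll(Path zipPath, int threads) throws Exception {
        // List the entries once, using a first ZipFile instance.
        Queue<ZipEntry> work = new ConcurrentLinkedQueue<>();
        try (ZipFile first = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = first.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    work.add(entry);
                }
            }
        }

        Map<String, Long> sizes = new ConcurrentHashMap<>();
        Callable<Void> worker = () -> {
            // Each worker still reads through its own ZipFile instance.
            try (ZipFile zf = new ZipFile(zipPath.toFile())) {
                byte[] buf = new byte[64 * 1024];
                // Pull the next shared entry until the queue is empty.
                for (ZipEntry e = work.poll(); e != null; e = work.poll()) {
                    long total = 0;
                    try (InputStream in = zf.getInputStream(e)) {
                        for (int n = in.read(buf); n != -1; n = in.read(buf)) {
                            total += n; // real code would process the bytes here
                        }
                    }
                    sizes.put(e.getName(), total);
                }
            }
            return null;
        };

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every worker has finished.
            List<Future<Void>> results = pool.invokeAll(Collections.nCopies(threads, worker));
            for (Future<Void> r : results) {
                r.get(); // rethrows a worker failure as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return sizes;
    }
}
```

This parallelizes across entries only. A single entry is still one compressed stream and has to be read sequentially by one thread.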

Licensed under: CC-BY-SA with attribution