Question

Is there an implementation of GZIPOutputStream that does the heavy lifting (compressing and writing to disk) in a separate thread?

We are continuously writing huge amounts of GZIP-compressed data. I am looking for a drop-in replacement that could be used instead of GZIPOutputStream.


Solution

You can write to a PipedOutputStream and have a thread which reads the PipedInputStream and copies it to any stream you like.

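This is a generic implementation. You pass it the target OutputStream (for example a GZIPOutputStream wrapping a file stream), and it returns an OutputStream for you to write to. Everything you write is copied to the target by a background thread.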

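import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Returns a stream whose contents are copied to 'out' by a background thread.
public static OutputStream asyncOutputStream(final OutputStream out) throws IOException {
    PipedOutputStream pos = new PipedOutputStream();
    final PipedInputStream pis = new PipedInputStream(pos);
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                byte[] bytes = new byte[8192];
                // read() returns -1 once the caller has closed the returned stream
                for (int len; (len = pis.read(bytes)) != -1; )
                    out.write(bytes, 0, len);
            } catch (IOException ioe) {
                ioe.printStackTrace();
            } finally {
                // the background thread owns both streams and closes them when done
                close(pis);
                close(out);
            }
        }
    }, "async-output-stream").start();
    return pos;
}

// Closes quietly, ignoring any IOException.
static void close(Closeable closeable) {
    if (closeable != null) try {
        closeable.close();
    } catch (IOException ignored) {
    }
}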
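To tie this back to the gzip case in the question, here is a minimal sketch of the same pipe pattern with a GZIPOutputStream as the target, so that both the compression and the write to the sink run on the worker thread. The names AsyncGzipExample, gzipInBackground and gunzip are made up for illustration, and the 64 KiB pipe size is an arbitrary choice. Unlike the method above, this sketch joins the worker thread so the caller knows when the gzip trailer has been written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical example class, not part of the answer above.
class AsyncGzipExample {

    /** Gzips data into sink on a background thread and waits for it to finish. */
    static void gzipInBackground(byte[] data, final OutputStream sink)
            throws IOException, InterruptedException {
        PipedOutputStream pos = new PipedOutputStream();
        // 64 KiB pipe buffer (arbitrary; the default pipe size is 1 KiB).
        final PipedInputStream pis = new PipedInputStream(pos, 64 * 1024);

        Thread worker = new Thread(() -> {
            // Compression and the write to sink both happen on this thread.
            try (PipedInputStream in = pis;
                 GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
                byte[] buf = new byte[8192];
                for (int len; (len = in.read(buf)) != -1; ) {
                    gzip.write(buf, 0, len);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, "async-gzip");
        worker.start();

        pos.write(data);  // the caller only copies raw bytes into the pipe
        pos.close();      // signals end of stream to the worker
        worker.join();    // wait until the gzip trailer has been written
    }

    /** Decompresses a gzip byte array; used here only to check the round trip. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            for (int len; (len = in.read(buf)) != -1; ) {
                out.write(buf, 0, len);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10000; i++) {
            sb.append("record ").append(i).append('\n');
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        gzipInBackground(input, sink);
        System.out.println("raw bytes: " + input.length
                + ", gzip bytes: " + sink.size());
    }
}
```

In a real writer the sink would typically be a BufferedOutputStream over a FileOutputStream instead of the in-memory buffer used here. Note that with a plain pipe, closing the write end does not by itself wait for the worker, which is why the sketch calls join().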

OTHER TIPS

I published some code that does exactly what you are looking for. It has always frustrated me that Java doesn't automatically pipeline calls like this across multiple threads, in order to overlap computation, compression, and disk I/O:

https://github.com/lukehutch/PipelinedOutputStream

This class splits writing to an OutputStream into separate producer and consumer threads (actually, starts a new thread for the consumer), and inserts a blocking bounded buffer between them. There is some data copying between buffers, but this is done as efficiently as possible.
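For readers who want to see the idea without pulling in the library, here is a rough sketch of a producer/consumer split with a blocking bounded buffer. It is not the code from the repository above: the class name BoundedBufferOutputStream, the maxChunks parameter and the 100 ms polling interval are all assumptions made for this illustration, and it uses a plain ArrayBlockingQueue of byte arrays.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the technique, not the PipelinedOutputStream source.
class BoundedBufferOutputStream extends OutputStream {
    private static final byte[] EOF = new byte[0]; // sentinel, compared by identity
    private final BlockingQueue<byte[]> queue;
    private final Thread consumer;
    private volatile IOException failure;
    private boolean closed;

    BoundedBufferOutputStream(final OutputStream out, int maxChunks) {
        this.queue = new ArrayBlockingQueue<>(maxChunks);
        this.consumer = new Thread(() -> {
            try {
                // Consumer side: drain chunks and write them to the target.
                for (byte[] chunk; (chunk = queue.take()) != EOF; ) {
                    out.write(chunk);
                }
            } catch (IOException e) {
                failure = e;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                try {
                    out.close();
                } catch (IOException e) {
                    if (failure == null) failure = e;
                }
            }
        }, "bounded-buffer-consumer");
        consumer.start();
    }

    // Producer side: blocks while the queue is full, but gives up if the consumer died.
    private void enqueue(byte[] chunk) throws IOException {
        try {
            while (!queue.offer(chunk, 100, TimeUnit.MILLISECONDS)) {
                if (failure != null) throw new IOException("background write failed", failure);
                if (!consumer.isAlive()) throw new IOException("consumer thread stopped");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while queueing data");
        }
    }

    @Override
    public void write(int b) throws IOException {
        enqueue(new byte[] {(byte) b});
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (len == 0) return;
        // Copy, because the caller may reuse its buffer after write() returns.
        enqueue(Arrays.copyOfRange(b, off, off + len));
    }

    @Override
    public void close() throws IOException {
        if (closed) return;
        closed = true;
        enqueue(EOF);
        try {
            consumer.join(); // wait until everything has reached the target
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while closing");
        }
        if (failure != null) throw new IOException("background write failed", failure);
    }
}
```

With a class like this, new GZIPOutputStream(new BoundedBufferOutputStream(fileOut, 16)) would compress on the calling thread and write to disk on the consumer thread, while new BoundedBufferOutputStream(new GZIPOutputStream(fileOut), 16) would move both steps to the consumer thread. Here fileOut stands for whatever file stream you already have. Wrapping the stream in a BufferedOutputStream before writing to it keeps the chunks reasonably large, since this sketch queues one array per write call.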

You can even layer this twice to do the disk writing in a separate thread from the gzip compression, as shown in README.md.

Licensed under: CC-BY-SA with attribution