Question

Let's say there are two files, a.gz and b.gz.

$ gzip_merge a.gz b.gz -output c.gz

I'd like to have a program like this. Of course,

$ cat a.gz b.gz > c.gz

doesn't work, because the final DEFLATE block of a.gz has BFINAL set and is followed by the GZIP header of b.gz (see RFC 1951 and RFC 1952). But if you clear BFINAL, throw away the second GZIP header, and realign the second file's DEFLATE data across byte boundaries, you can merge the two.

In fact, I thought of writing an open source program for this, but I didn't know how to publish it. So I asked Joel to be my program manager. After I walked him through my explanation and defended it, he finally understood what I wanted to do, but said he was too busy. :(

Of course, I could write it myself and try to publish it on my own. But I can't do this alone, because the work I do at my day job is the property of my employer.

Are there any volunteers? We could work as programmer (me) and publisher (you), or programmer (you) and publisher (me). All I need is some credit. I once implemented the Universal Decompressor Virtual Machine described in RFC 3320, so I know this is feasible.

Or you could point me to a program that already does this. It would be very useful for managing log files, for example merging 365 daily gzipped log files into one. ;)

Thanks.

Solution

"Of course, cat a.gz b.gz > c.gz doesn't work."

Actually, it works just fine. I just tested it. It's even documented (sort of) in the gzip man page.

   Multiple  compressed  files  can  be concatenated. In this case, gunzip
   will extract all members at once. For example:

         gzip -c file1  > foo.gz
         gzip -c file2 >> foo.gz

   Then

         gunzip -c foo

   is equivalent to

         cat file1 file2
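
As a quick check, something like the following should reproduce this. It is only a minimal sketch: it assumes a POSIX shell with gzip, gunzip, cmp and mktemp available, and the file names and contents are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is overwritten.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Two small sample inputs (hypothetical contents).
printf 'first file\n'  > file1
printf 'second file\n' > file2

# Compress each file separately; -c writes to stdout and keeps the originals.
gzip -c file1 > a.gz
gzip -c file2 > b.gz

# Plain byte-level concatenation gives a gzip file with two members.
cat a.gz b.gz > c.gz

# Decompress the merged file and compare it with the concatenated originals.
gunzip -c c.gz > merged.txt
cat file1 file2 > expected.txt
cmp merged.txt expected.txt && echo "identical"
```

If the concatenation is valid, cmp exits successfully and the script prints "identical". No DEFLATE bits are touched; c.gz simply holds both members back to back.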

OTHER TIPS

You could also:

zcat a.gz b.gz > c.txt && gzip c.txt

as long as your Linux/Unix distribution ships zcat, which most do (and you can install it on those that do not). Note that gzip replaces c.txt with c.txt.gz.

Alternatively:

zcat a.gz b.gz | gzip -c > c.txt.gz
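
For the many-log-files case from the question, the two approaches can be compared side by side. This is a rough sketch, not a tested tool: the log-NN.gz names are hypothetical, and it uses gzip -dc rather than zcat only because that spelling behaves the same on more systems.

```shell
# Work in a throwaway directory so nothing real is overwritten.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Hypothetical daily logs: log-01.gz, log-02.gz, log-03.gz.
for day in 01 02 03; do
    printf 'entry for day %s\n' "$day" | gzip -c > "log-$day.gz"
done

# Option 1: byte-level concatenation. Fast, no recompression,
# but the result contains one gzip member per input file.
cat log-*.gz > merged-cat.gz

# Option 2: decompress everything and recompress as a single stream.
gzip -dc log-*.gz | gzip -c > merged-recompressed.gz

# Both results decompress to the same text.
gzip -dc merged-cat.gz > from-cat.txt
gzip -dc merged-recompressed.gz > from-recompressed.txt
cmp from-cat.txt from-recompressed.txt && echo "same contents"
```

The shell expands log-*.gz in sorted order, so zero-padded names keep the days in sequence. Recompressing costs CPU time but may compress somewhat better for many small files, since the compressor sees all the data as one stream.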
Licensed under: CC-BY-SA with attribution