Question

I have the following data from a Photoshop file that uses zip-compression (RFC1951):

250, 255, 159, 1, 47, 248, 63, 42, 63, 172, 229, 1, 2, 12, 0, 209, 255, 31, 225

This decompresses to the following 32 bytes, repeated 16 times (512 bytes in total):

255, 255, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

Re-compressing this gives me:

251, 255, 159, 1, 47, 248, 63, 42, 63, 172, 229, 1

Why isn't this exactly the same as the original input?

(originally posted on codeplex but got no answers: http://dotnetzip.codeplex.com/discussions/406943)

Solution

First, to get the terminology right: RFC 1951 specifies the deflate format, which is what your data is, not "zip-compression". The zip format can use deflate, but it then wraps the deflate data with zip headers, trailers, and a central directory.

Second, in general there is no assurance that decompressing and then recompressing will reproduce the original compressed bytes. Most compressors have several compression levels and other options, and different compressors make different choices, so the same input can produce many different valid compressed streams. The only thing a lossless compressor guarantees is the other direction: compressing and then decompressing gives you back exactly the original data.
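To illustrate the point, here is a minimal sketch using Python's standard `zlib` module. That choice is an assumption: the question does not say which library produced either stream. The helper names `raw_deflate` and `raw_inflate` are made up for this example. It compresses the same 512 bytes as raw deflate at every compression level and checks that each result round-trips, even though the compressed bytes are not all identical.

```python
import zlib

# The 512 bytes from the question: 255, 255, then 30 zeros, repeated 16 times.
data = (bytes([255, 255]) + bytes(30)) * 16

def raw_deflate(payload: bytes, level: int) -> bytes:
    """Compress to raw deflate (RFC 1951); wbits=-15 omits the zlib wrapper."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(payload) + compressor.flush()

def raw_inflate(stream: bytes) -> bytes:
    """Decompress a raw deflate stream."""
    return zlib.decompressobj(-15).decompress(stream)

# One compressed stream per zlib level (0 = stored, 9 = best compression).
streams = {level: raw_deflate(data, level) for level in range(10)}

# Every stream must decompress to the original data ...
round_trips = all(raw_inflate(s) == data for s in streams.values())
# ... but the compressed bytes are not unique across levels.
distinct = len(set(streams.values()))

print("all levels round-trip:", round_trips)
print("distinct compressed outputs:", distinct)
```

Level 0 stores the data uncompressed, so it always differs from the higher levels. Whether the higher levels differ from each other depends on the input and the zlib version.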

For your particular example, the first compressor emitted two extraneous empty blocks after the block that holds the data. Disassembled, that deflate stream is:

static
literal 255 255 0
match 29 1
literal 255
match 258 32
match 221 32
end
!
static
end
!
last
static
end

The second compressor did not include the extraneous empty blocks:

last
static
literal 255 255 0
match 29 1
literal 255
match 258 32
match 221 32
end
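
The comparison can be checked with a short Python sketch, again assuming the standard `zlib` module; `inflate_raw` and `block_header` are hypothetical helper names introduced here. It inflates both byte sequences from the question as raw deflate, compares the results, and reads the BFINAL and BTYPE bits from the first byte of each stream. In the listings above, "static" corresponds to BTYPE 1 (fixed Huffman codes), "last" to BFINAL 1, and "match 29 1" to a copy of length 29 from distance 1.

```python
import zlib

# Byte sequences copied from the question.
original = bytes([250, 255, 159, 1, 47, 248, 63, 42, 63, 172, 229, 1,
                  2, 12, 0, 209, 255, 31, 225])
recompressed = bytes([251, 255, 159, 1, 47, 248, 63, 42, 63, 172, 229, 1])

def inflate_raw(stream):
    """Return (decompressed bytes, bytes left over after the deflate stream)."""
    d = zlib.decompressobj(-15)  # negative wbits: raw deflate, no wrapper
    out = d.decompress(stream)
    return out, d.unused_data

def block_header(first_byte):
    """Split the first three bits of a deflate block into (BFINAL, BTYPE)."""
    return first_byte & 1, (first_byte >> 1) & 3

out_a, rest_a = inflate_raw(original)
out_b, rest_b = inflate_raw(recompressed)

print("same output:", out_a == out_b, "length:", len(out_a))
print("first block headers:", block_header(original[0]), block_header(recompressed[0]))
print("bytes after the original stream:", list(rest_a))
print("Adler-32 of the output:", hex(zlib.adler32(out_a)))
```

The first bytes, 250 and 251, differ only in the BFINAL bit, which is why the first 12 bytes of the two streams are otherwise identical. The four bytes that follow the deflate data in the original input (209, 255, 31, 225) equal the big-endian Adler-32 of the decompressed data, which suggests the original was taken from a zlib (RFC 1950) stream and the trailer was included in the dump; they are not part of the deflate data itself.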
Licensed under: CC-BY-SA with attribution