DEFLATE Encoding with static Huffman Codes

https://stackoverflow.com/questions/17398931

02-06-2022
|

質問

need some help to understand how DEFLATE Encoding works. I know that is a combination of the LZSS algorithm and Huffman coding.

So let encode for example "Deflate late". Params: [Search buffer: 8kb and Look-ahead buffer 4kb] Well, the output of LZSS algorithm is "Deflate <5, 4>" The next step uses static huffman coding to reduce the redundancy. Here is my problem, I dont know how should i encode this pair <5, 4> with huffman.

[Edited]

D 000
f 001
l 010
a 011
t 100
_ 101
e 11

So well, according to this table the string "Deflate " is written as 000 11 001 010 011 100 11 101. As a next step lets encode the pair (5, 4). The fixed prefix code of the length 4 according to the book "Data Compression - The Complete Reference" is 258, followed by fixed prefix code of the distance 5 (Code 4 + 1 Extra bit).

That can be summarized as:

length 4 -> 258 -> 0000010
distance 5 -> 4 + 1 extra bit -> 00100|0

So, the encoded string is written as [header: 1 01] 000 11 001 010 011 100 11 101 0000010 001000 [end-of-block: 0000000], BUT if i create a huffman tree, it is not a static huffman anymore, right?

Good day

解決

is not the Deflate static code. The static literal/length codes are all 7, 8, or 9 bits, and the distance codes are all 5 bits. You asked about the static codes.

'Deflate late' encoded in static deflate format as the literals 'Deflate ' and a length 4, distance 5 match in hex is:

73 49 4d cb 49 2c 49 55 00 11 00

That is broken down as follows (bits are read from the least significant part of each byte first):

011 - 01 means fixed code, 1 means last block
00101110 - D
10101001 - e
01101001 - f
00111001 - l
10001001 - a
00100101 - t
10101001 - e
00001010 - space
0100000 - length 4
00100 - distance 5 or 6 depending on one extra bit
0 - extra bit -> distance 5
0000000 - end code
0 - fill bit to byte boundary

ライセンス： CC-BY-SA と帰属

所属していません StackOverflow