Question

I am coding an Huffman string compressor and I would like to have a confirmation I am doing the optimal compression with my tree.

I am using this kind of tree:

enter image description here

Instead of this kinda tree:

enter image description here

I think that over 10 single characters, it's not possible to compress on 8 bits..

Is the first image really the optimal one?

Was it helpful?

Solution

The very basic idea is to add the two smallest nodes, creating a new node which value is the sum of its 2 children.

Respecting this rule up to the root of the tree guarantee that the tree produced will be optimal.

Therefore, you have no control on the shape of the tree : it entirely depends on the probability distribution of characters. It may end up being a degenerated tree (one branch per level) if the probability distribution looks like a Fibonacci serie.

Creating Huffman tree with a pre-set maximum depth is therefore more complex, and requires to break the usual rule of always adding the 2 smallest nodes. The resulting tree will obviously not be optimal.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top