Question

Under Huffman Encoding, if one character occurs more than 1/3rd of the time, is it guaranteed that there will be at least one character whose codeword is of length 1?

I thought of 2 cases where this might be true, but they're not exactly proofs -

Case 1: There is only one character which occurs more than 1/3rd of the time. In this case, a good huffman encoding scheme will ensure that this character corresponds to a single-character encoding, because there are no competing characters. Even though there may be characters that occur with high probability (but still less than 0.33), they must have length greater than 1 to ensure the prefix-free nature of the encoding.

Case 2: There are multiple characters that occur more than 33% of the time. We can also make the stronger assertion that in this case, there can only be at most 2 such characters. This can be further subdivided into cases where 1) there are a total of 2 characters in the encoding scheme, and 2) where there are more than 2 characters in the scheme. In case 1), both the characters can correspond to a single-character encoding scheme (0 and 1 for each respectively). In case 2), at most 1 of the 2 largely prevalent/frequent characters must have an encoding of length of 1 (since we're considering a good huffman encoding scheme) while the other n-1 characters have a length of at least 2.

I'm afraid that this is not the right approach. Could someone please help me out?

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top