Trying to generate a uniquely decodable code and decode it
-
06-07-2021 - |
Question
I'm trying to encode arbitrary symbols into a bit string, and I don't really understand how I can either generate them or even decode a bit string containing those.
I want to work on arbitrary symbol to compress, I don't really know if an uniqulely decodable code is what I am looking for, maybe an arithmetic code or canonical huffman code ?
I just need a list of bit strings, describing the most frequent to least frequent, for any size of symbol table.
Solution
Encoding:
- Generate Frequency table per symbol
- Generate Probability Tree based on this frequency table
- Generate Code Table - such that most frequent symbol gets the smallest bit string
Output: [Frequency Table + Sequence of Bit Strings (put together end to end)]
Important thing to note is that the sequence of these bit strings can be later segregated directly all by them selves. i.e. say [10010001] => {100, 1000, 1} (only an example)
Decoding:
- Obtain Frequency table & Bit String sequence.
- Generate Probability Tree (Same as 1.2)
- Generate Code Table (Same as 1.3)
Recreate Data:
- Parse Bit-String
- Match with the Code Table
- Output matched Symbol