Question

I'm trying to encode arbitrary symbols into a bit string, and I don't really understand how I can either generate them or even decode a bit string containing those.

I want to work on arbitrary symbol to compress, I don't really know if an uniqulely decodable code is what I am looking for, maybe an arithmetic code or canonical huffman code ?

I just need a list of bit strings, describing the most frequent to least frequent, for any size of symbol table.

Was it helpful?

Solution

  1. Encoding:

    1. Generate Frequency table per symbol
    2. Generate Probability Tree based on this frequency table
    3. Generate Code Table - such that most frequent symbol gets the smallest bit string
    4. Output: [Frequency Table + Sequence of Bit Strings (put together end to end)]

      Important thing to note is that the sequence of these bit strings can be later segregated directly all by them selves. i.e. say [10010001] => {100, 1000, 1} (only an example)

  2. Decoding:

    1. Obtain Frequency table & Bit String sequence.
    2. Generate Probability Tree (Same as 1.2)
    3. Generate Code Table (Same as 1.3)
    4. Recreate Data:

      1. Parse Bit-String
      2. Match with the Code Table
      3. Output matched Symbol
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top