Question

Well, I have to make a PPM image compressor using the LZW algorithm. I understand the algorithm and the code, and I have implemented a string version (in Java, for testing).

The big problem is the compressed output. For example, if I have:

Input: ABCDABCDABCDABCD

Output: 65 66 67 68 256 258 260 259 257 68
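
For reference, a minimal Java sketch of such a string compressor reproduces exactly this code sequence (the class and method names here are just illustrative, and I assume the usual LZW formulation with a dictionary seeded with the 256 single-byte strings):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LzwCompress {
    // Dictionary starts with the 256 single-byte strings; new entries
    // are added on the fly starting at code 256.
    static List<Integer> compress(String input) {
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) {
            dict.put(String.valueOf((char) i), i);
        }
        int nextCode = 256;
        List<Integer> output = new ArrayList<>();
        String current = "";
        for (char c : input.toCharArray()) {
            String candidate = current + c;
            if (dict.containsKey(candidate)) {
                current = candidate;           // keep extending the match
            } else {
                output.add(dict.get(current)); // emit code for longest match
                dict.put(candidate, nextCode++);
                current = String.valueOf(c);
            }
        }
        if (!current.isEmpty()) {
            output.add(dict.get(current));     // flush the final match
        }
        return output;
    }

    public static void main(String[] args) {
        // Prints: [65, 66, 67, 68, 256, 258, 260, 259, 257, 68]
        System.out.println(compress("ABCDABCDABCDABCD"));
    }
}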

My input has 16 characters, but if I simply save this output as a text file there is no compression, because the result has 34 characters. So I thought I would save a binary file and write each code as a single byte, but then there is the problem of codes >= 256. I also had the idea of using an escape value whenever a code greater than 255 occurs, something like this:

(in this case I would reserve the values 255 and 0, removing them from my dictionary)

 65 66 67 68 256 258 would become:

 65 66 67 68 255 1 0 255 3 0

Here every value of 255 marks the start of a run of consecutive bytes that are summed together with it (so 255 1 0 encodes 255 + 1 = 256), and the byte 0 marks the end of the sum.
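
Concretely, the scheme I have in mind would be something like the following sketch (escapeEncode is just an illustrative name; it could be added to the compressor class above, and it assumes the values 0 and 255 are reserved as described):

// Sketch of the escape idea: codes < 255 are written as-is; codes >= 256
// become 255 (the escape, which also counts toward the sum), then the
// bytes to be summed with it, then a terminating 0.
static List<Integer> escapeEncode(List<Integer> codes) {
    List<Integer> out = new ArrayList<>();
    for (int code : codes) {
        if (code < 255) {
            out.add(code);           // fits in a single byte
        } else {
            out.add(255);            // escape marker, also part of the sum
            int rest = code - 255;   // rest >= 1 because code >= 256
            while (rest > 254) {     // split large remainders into chunks
                out.add(254);
                rest -= 254;
            }
            out.add(rest);
            out.add(0);              // 0 terminates the sum
        }
    }
    return out;
}

On the ten codes above this produces 20 bytes for a 16-byte input.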

The problem is that even with this scheme my output would be larger than the input, so the compression makes no sense. Is there a way around this?

Solution

LZW has a predictable next code size: at every step the encoder and the decoder both know how many entries the dictionary holds, so both can compute how many bits the next code needs. Your encoder should write each code using exactly the number of bits required for it, and your decoder should compute the number of bits the next code could occupy and read exactly that many bits.
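
For example, a minimal bit-packing writer for the encoder side might look like this sketch (BitWriter and writeCode are illustrative names, not a standard API, and I assume codes are packed least-significant-bit first):

import java.io.ByteArrayOutputStream;

// Minimal LSB-first bit writer.
class BitWriter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int buffer = 0;    // bits waiting to be flushed
    private int bitCount = 0;  // number of valid bits in buffer

    // Write the low `width` bits of `code`.
    void writeCode(int code, int width) {
        buffer |= code << bitCount;
        bitCount += width;
        while (bitCount >= 8) {      // flush every complete byte
            out.write(buffer & 0xFF);
            buffer >>>= 8;
            bitCount -= 8;
        }
    }

    // Pad the final partial byte with zero bits and return the stream.
    byte[] finish() {
        if (bitCount > 0) {
            out.write(buffer & 0xFF);
        }
        return out.toByteArray();
    }
}

The matching bookkeeping on the encoder side: start at 9 bits (the 256 single-byte codes plus the new entries from 256 upward), and increase the width by one as soon as the next free dictionary code reaches 1 << width, so every emitted code fits in the width the decoder expects.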

So your decoder would look something like this:

while (input remains) {
   width = bit size of the next code, computed from the current dictionary size
   code = read `width` bits
   decode and output the dictionary entry for code
   update the dictionary
}
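
In Java that loop might look like the sketch below (assuming a BitReader that mirrors the writer above with the same LSB-first packing; the width-growth rule must match the encoder's exactly, and the code == dict.size() branch handles the one entry the decoder has not added yet):

import java.util.ArrayList;
import java.util.List;

// Mirror image of the writer: reads `width` bits LSB-first.
class BitReader {
    private final byte[] data;
    private int pos = 0;  // absolute bit position
    BitReader(byte[] data) { this.data = data; }

    // Returns the next code, or -1 when fewer than `width` bits remain.
    int readCode(int width) {
        if (pos + width > data.length * 8) return -1;
        int code = 0;
        for (int i = 0; i < width; i++, pos++) {
            code |= ((data[pos / 8] >> (pos % 8)) & 1) << i;
        }
        return code;
    }
}

class LzwDecode {
    static String decompress(BitReader in) {
        List<String> dict = new ArrayList<>();
        for (int i = 0; i < 256; i++) dict.add(String.valueOf((char) i));

        int width = 9;  // matches the encoder's starting width
        StringBuilder result = new StringBuilder();
        String previous = null;
        int code;
        while ((code = in.readCode(width)) != -1) {
            String entry;
            if (code < dict.size()) {
                entry = dict.get(code);
            } else {
                // the code refers to the entry being defined right now
                // (the classic KwKwK case)
                entry = previous + previous.charAt(0);
            }
            result.append(entry);
            if (previous != null) {
                dict.add(previous + entry.charAt(0));  // update dictionary
            }
            previous = entry;
            // grow the width as soon as the next incoming code might not fit;
            // the +1 accounts for the encoder's dictionary being one entry ahead
            if (dict.size() + 1 >= (1 << width)) width++;
        }
        return result.toString();
    }
}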

No compression algorithm works in the universal case: any encoder that makes some inputs smaller must make others larger. LZW is good for highly repetitive data; with non-repetitive data the compressed size will always be larger than the initial data size.

Licensed under: CC-BY-SA with attribution