LZW decoding miss the first code entry

https://stackoverflow.com/questions/16444157

java
lzw

14-04-2022
|

Question

I followed the Rosetta Java code implementation.

I tried do this LZW coding with my own Dictionary and not with the ASCII Dictionary which was used. When I try with my own Dictioanry there is a problem about decoding... The result is wrong, because each of decoded word don't view the first 'a' letter. The result have to be 'abraca abrac abra' and not 'braca brac bra'

I see the problem in decode() method at String act = "" + (char)(int)compressed.remove(0); This will remove all first 'a' letter. But I don't have any ideas how can I modify this line... For example if I use the String act = "";instead of above line... the coding will be very wrong, or use another command... I don't know how can I solve this little problem... Or maybe I am looking for on the bad way for the solution.

public class LZW {  


public static List<Integer> encode(String uncompressed) {

    Map<String,Integer> dictionary = DictionaryInitStringInt();        
    int dictSize = dictionary.size();

    String act = "";
    List<Integer> result = new ArrayList<Integer>();

    for (char c : uncompressed.toCharArray()) {
        String next = act + c;
        if (dictionary.containsKey(next))
            act = next;
        else {
            result.add(dictionary.get(act));
            // Add next to the dictionary.
            dictionary.put(next, dictSize++);
            act = "" + c;
        }
    }

    // Output the code for act.
    if (!act.equals(""))
        result.add(dictionary.get(act));
    return result;
} 

public static String decode(List<Integer> compressed) {

    Map<Integer,String> dictionary = DictionaryInitIntString();        
    int dictSize = dictionary.size();

    String act = "" + (char)(int)compressed.remove(0);
    //String act = "";
    String result = act;

    for (int k : compressed) {            
        String entry;
        if (dictionary.containsKey(k))
            entry = dictionary.get(k);
        else if (k == dictSize)
            entry = act + act.charAt(0);
        else
            throw new IllegalArgumentException("Nincs ilyen kulcs: " + k);

        result += entry;

        dictionary.put(dictSize++, act + entry.charAt(0));

        act = entry;
    }
    return result;
}

public static Map<String,Integer> DictionaryInitStringInt()
{               
    char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
                    'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
                    '?','.',','};
    int charactersLength = characters.length;

    Map<String,Integer> dictionary = new HashMap<String,Integer>();

    for (int i = 0; i < charactersLength; i++)
            dictionary.put("" + characters[i], i); 

    return dictionary;
}

public static Map<Integer,String> DictionaryInitIntString()
{               
    char[] characters = {'a','b','c','d','e','f','g','h','i','j', 'k','l','m','n',
                    'o','p','q','r','s','t','u','v','w','x','y','z',' ','!',
                    '?','.',','};
    int charactersLength = characters.length;

    Map<Integer,String> dictionary = new HashMap<Integer,String>();

    for (int i = 0; i < charactersLength; i++)
            dictionary.put(i,"" + characters[i]); 

    return dictionary;
}

public static void main(String[] args) {

    List<Integer> compressed = encode("abraca abrac abra");
    System.out.println(compressed);

    String decodeed = decode(compressed);
    // decodeed will be 'braca brac bra'
    System.out.println(decodeed);
}

}

Solution

The rosetta example use

"" + (char) (int) compressed.remove(0);

because the first 256 entries of the dictionnary map exactly the 'char' values.

With a custom dictionnary this line should be:

String act = dictionary.get(compressed.remove(0));

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow