Question

I've seen a demonstration in the js1k contest and I'm amazed by this obfuscation algorithm. Can somebody explain how it works? I got the obfuscation results, but I can't understand how the obfuscation algorithm works. What sorcery is this? LOL

I've extracted this:

for(
    encoded_string = '... encoded string ...';
    g = /[^ -IM[-~]/.exec(encoded_string);
){
    encoded_string = encoded_string.split(g).join(encoded_string.split(g).shift());
}

// ... Use encoded_string to do what you want

I think that maybe the point is in the REGEXP.

The original code (http://js1k.com/2014-dragons/details/1854)

If you want, you can see the results of the obfuscation here (http://jsbin.com/xeruqita/1/edit); I've added the first var lines.


Solution

Basically, this logic appears to strip out each instance of a group of characters from a string and return the "cleaned" string.

I can't determine what is special about the characters being matched by the pattern [^ -IM[-~] (a negated class, so it matches any character outside the ranges space through I, the letter M, and [ through ~), but I can walk you through what the code is doing . . .
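As a quick aside on the pattern itself before the walkthrough: the sketch below checks which printable ASCII characters the class matches. Restricting the check to codes 32 through 126 is an assumption made for illustration; the pattern also matches characters outside that range.

```javascript
// The pattern from the question: a negated character class.
const pattern = /[^ -IM[-~]/;

// Collect every printable ASCII character (codes 32..126) the pattern matches.
let matched = "";
for (let code = 32; code <= 126; code++) {
  const ch = String.fromCharCode(code);
  if (pattern.test(ch)) matched += ch;
}

console.log(matched); // "JKLNOPQRSTUVWXYZ"
```

So within printable ASCII, the loop only ever reacts to the uppercase letters J, K, L and N through Z.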

  • set up a for loop

    for(
    
  • initialize the loop variable

        encoded_string = '... encoded string ...';
    
  • set the loop condition . . . note: exec() returns null if there is no match and null will evaluate to false, ending the loop

        g = /[^ -IM[-~]/.exec(encoded_string);
    
  • close the loop header (the update expression is left empty)

    )
    
  • this next step has multiple parts . . . let me go through it in steps

    {
        encoded_string = encoded_string.split(g).join(encoded_string.split(g).shift());
    }
    
  • during each loop, take the result of the exec (g) and use that as an input to split the original encoded_string (this is done in two places).

    g is an array, but when it is passed to split() it is converted to a string. Since the pattern has no capture groups, the array holds a single element, so the result is the character that was matched by the regex (pretty interesting, actually, considering the makeup of g . . . see the specs here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec).

    By using the match as the "splitter", the code is essentially removing all instances of that match and leaving an array of the "chunks" of remaining characters (example: if the match was c and the string was abcde, splitting the string would result in ["ab", "de"]).

  • The split result is then used twice (the code calls split() two times):

    1. it is used as the source of a join, to create the resulting string

    2. the first element of that "split array" is used as the "join character"

      So, that second part makes me think there must be an assumption that encoded_string always starts with one of the characters matched by the exec() regex pattern. If it does, the first element of the "split array" will always be an empty string, meaning that the result of the join() will be the original encoded_string value with all instances of the matched character removed.

      If that assumption were not true, there would be actual characters in that first position of the "split array", and they would be inserted between all of the elements in the split array, creating a very different result string from the join.
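    The individual pieces of that statement can be checked one at a time. This is a minimal sketch using a made-up sample string and the simple pattern /c/ (both are assumptions for illustration, not taken from the demo):

```javascript
// Hypothetical sample that starts with the matched character.
const sample = "cabcde";

const g = /c/.exec(sample);                 // array ["c"] plus index/input properties
const asString = String(g);                 // "c": what split() effectively receives
const parts = sample.split(g);              // ["", "ab", "de"]
const separator = sample.split(g).shift();  // "": first element of a second split
const joined = parts.join(separator);       // "abde"

console.log(asString, parts, JSON.stringify(separator), joined);
```

    Because split() is called twice, the shift() works on a separate array and does not remove anything from the array being joined.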

    Let me show some examples, to better illustrate . . .

    1) MY ASSUMPTION IS TRUE


    This uses a small input string, and a pattern that matches the first characters of that string:

    encoded_string = "abcdeabc";

    g = /[abc]/.exec(encoded_string);

    For the first loop,

    • g would find "a" as the first match,
    • the "split string" would be ["", "bcde", "bc"],
    • the "join character" would be "",
    • and the result of the join() would be "bcdebc".

    For the second loop,

    • g would find "b" as the first match,
    • the "split string" would be ["", "cde", "c"],
    • the "join character" would be "",
    • and the result of the join() would be "cdec".

    For the third loop,

    • g would find "c" as the first match,
    • the "split string" would be ["", "de", ""],
    • the "join character" would be "",
    • and the result of the join() would be "de".

    There would be no more matches, so the loop would end there, with a final encoded string value of "de".

    2) MY ASSUMPTION IS NOT TRUE


    This uses the same input string as above, but it uses a pattern that does not match the first characters of that string:

    encoded_string = "abcdeabc";

    g = /[bcd]/.exec(encoded_string);

    For the first loop,

    • g would find "b" as the first match,
    • the "split string" would be ["a", "cdea", "c"],
    • the "join character" would be "a",
    • and the result of the join() would be "aacdeaac".

    For the second loop,

    • g would find "c" as the first match,
    • the "split string" would be ["aa", "deaa", ""],
    • the "join character" would be "aa",
    • and the result of the join() would be "aaaadeaaaa".

    For the third loop,

    • g would find "d" as the first match,
    • the "split string" would be ["aaaa", "eaaaa"],
    • the "join character" would be "aaaa",
    • and the result of the join() would be "aaaaaaaaeaaaa".

    There would be no more matches, so the loop would end there, with a final encoded string value of "aaaaaaaaeaaaa".

    While it's possible that this second approach might be what the author was trying to do, my money is on the first approach to be the more likely functionality. ;)
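    Both walkthroughs can be reproduced with a small tracing helper. This is a sketch only: runLoop is a hypothetical name, and it takes the pattern as a parameter so the two example patterns can be swapped in. It assumes a non-global pattern, as in the original code.

```javascript
// Hypothetical helper: runs the loop from the question and records each pass.
function runLoop(input, pattern) {
  let s = input;
  const steps = [];
  let g;
  while ((g = pattern.exec(s))) {
    const parts = s.split(g);   // g (an array) is coerced to the matched string
    s = parts.join(parts[0]);   // same result as s.split(g).join(s.split(g).shift())
    steps.push(s);
  }
  return { result: s, steps };
}

const caseTrue = runLoop("abcdeabc", /[abc]/);
const caseFalse = runLoop("abcdeabc", /[bcd]/);

console.log(caseTrue.steps);  // ["bcdebc", "cdec", "de"]
console.log(caseFalse.steps); // ["aacdeaac", "aaaadeaaaa", "aaaaaaaaeaaaa"]
```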


Phew! That was a lot. Hope that helped clarify!

OTHER TIPS

I found the post mortem for the Minecraft demo at https://reindernijhoff.net/2014/04/js1k-post-mortem-minecraft/

The developer mentions that he packed the program using RegPack, which is a library for creating self-unpacking minified JavaScript code. This means the purpose is not obfuscation, but compression.
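To make the compression reading concrete, here is a minimal sketch that feeds a made-up packed string through the loop as quoted in the question. The string and its layout (the repeated text first, then a token character marking each further occurrence) are assumptions for illustration, not actual RegPack output.

```javascript
// Hypothetical packed string: "Z" is the token, and the text before the
// first "Z" is the chunk that each "Z" stands for.
let encoded_string = "hello Zworld, Zagain";
let g;

// Loop as quoted in the question, written as a while loop for readability.
while ((g = /[^ -IM[-~]/.exec(encoded_string))) {
  encoded_string = encoded_string.split(g).join(encoded_string.split(g).shift());
}

console.log(encoded_string); // "hello hello world, hello again"
```

Each "Z" costs one character in the packed string but expands to the six-character chunk "hello ", which is where the size saving would come from when a chunk repeats often.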

GitHub: https://github.com/Siorki/RegPack

Online-Demo: http://siorki.github.io/regPack.html

For details of what the unpacker actually does, the other reply to your question already covers this quite well.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow