Question

Purely for obfuscation, the first three lines seem to clean the script of unnecessary line breaks quite nicely.

  1. Can anyone tell me what the four preg_replace lines (lines 1 - 4) actually do? The only thing I know from trial and error is that if I comment out the fourth line the site works, and if I leave it in place the site breaks.

    <?php
    
    header("Content-type: text/javascript; charset=UTF-8");   
    ob_start("compress"); 
    function compress($buffer) 
    {
        # remove extra or unnecessary new lines from javascript
        $buffer = preg_replace('/([;])\s+/', '$1', $buffer);
        $buffer = preg_replace('/([}])\s+(else)/', '$1else', $buffer);
        $buffer = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);
        $buffer = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);
    
        return $buffer;
    }
    
  2. Is there a better way to remove one or more line breaks from JavaScript?

Solution

Dissection of all four regular expressions

Let's try and dissect each one of the regular expressions.

First regex

$buffer = preg_replace('/([;])\s+/', '$1', $buffer);

Explanation

(      # beginning of the first capturing group
 [;]   # match the literal character ';'
)      # ending of the first capturing group
\s+    # one or more whitespace characters (including newlines)

The above regular expression removes any whitespace that immediately follows a semicolon. ([;]) is a capturing group: when a match is found, the captured text is stored in a backreference so we can use it later. For example, if our string was foo; <space><space>, the expression would match the ; and the whitespace characters after it. The replacement pattern here is $1, which means the entire matched string is replaced with just the semicolon. Note that the regex knows nothing about JavaScript syntax, so it applies to every semicolon in the buffer, including ones inside string literals.
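To make the effect concrete, here is a minimal sketch that runs the first regex on a small input. The sample JavaScript in $js is made up for illustration and is not from the original script.

```php
<?php
// Hypothetical sample input: statements separated by spaces, tabs and newlines.
$js = "var a = 1;  \n\tvar b = 2;\nfoo();";

// Same pattern and replacement as the first regex:
// keep the semicolon ($1) and drop the whitespace that follows it.
$out = preg_replace('/([;])\s+/', '$1', $js);

echo $out, PHP_EOL; // var a = 1;var b = 2;foo();
```

The last semicolon is left alone because \s+ needs at least one whitespace character after it.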


Second regex

$buffer = preg_replace('/([}])\s+(else)/', '$1else', $buffer);

Explanation

(      # beginning of the first capturing group
 [}]   # match the literal character '}'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(else) # match and capture 'else'

The above regex removes any whitespace between a closing curly brace (}) and else. The replacement pattern here is $1else, which means the matched text is replaced by what the first capturing group ([}]) captured (which is just the closing brace), followed by the keyword else. Nothing much to it.
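A small sketch of the second regex in action, using a made-up if/else snippet as input (the contents of $js are an assumption for illustration):

```php
<?php
// Hypothetical sample input: "else" sits on its own line after the closing brace.
$js = "if (x) {\n    a();\n}\n\nelse {\n    b();\n}";

// Same pattern and replacement as the second regex:
// keep the brace ($1), drop the whitespace, and re-emit "else".
$out = preg_replace('/([}])\s+(else)/', '$1else', $js);

echo $out, PHP_EOL;
// if (x) {
//     a();
// }else {
//     b();
// }
```

Only the gap between } and else is collapsed; the rest of the formatting is untouched by this particular regex.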


Third regex

$buffer = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);

Explanation

(      # beginning of the first capturing group
 [}]   # match the literal character '}'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(var)  # match and capture 'var'

This is almost the same as the previous regex. The differences are the keyword (var instead of else) and the replacement, $1;var, which inserts a semicolon between the closing brace and var. Semicolons are often optional in JavaScript because the interpreter can insert them at line breaks. But once multiple statements are joined on a single line, the interpreter has no way to know where one statement ends and the next begins, so a ; is needed to terminate each statement.
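The semicolon insertion is easiest to see on an example. This sketch assumes a made-up input in $js where a function body is followed by a var statement on the next line:

```php
<?php
// Hypothetical sample input: a closing brace, a newline, then a var statement.
$js = "function f() {\n    return 1;\n}\nvar x = f();";

// Same pattern and replacement as the third regex:
// keep the brace ($1), add a semicolon, and re-emit "var".
$out = preg_replace('/([}])\s+(var)/', '$1;var', $js);

echo $out, PHP_EOL;
// function f() {
//     return 1;
// };var x = f();
```

The added ; keeps the two statements separate now that the line break between them is gone.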


Fourth regex

$buffer = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);

Explanation

(      # beginning of the first capturing group
 [{};] # match the literal character '{' or '}' or ';'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(      # beginning of the second capturing group
 \$    # match the literal character '$'
)      # ending of the second capturing group

The replacement pattern here is $1\$, which means the entire matched string would be replaced with what was matched by the first capturing group ([{};]) followed by a literal $ character.
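Here is a sketch of the fourth regex on a made-up jQuery-style snippet (the input in $js is hypothetical). It also illustrates one plausible reason, not confirmed by the question, why this line could break a site: unlike the third regex, no semicolon is inserted, so code that relied on the line break to end a statement is glued to the next one.

```php
<?php
// Hypothetical sample input: a function expression without a trailing
// semicolon, followed by a line that starts with "$".
$js = <<<'JS'
var ready = function () {
    init();
}
$(document).ready(ready);
JS;

// Same pattern and replacement as the fourth regex:
// keep the brace or semicolon ($1), drop the whitespace, re-emit a literal "$".
$out = preg_replace('/([{};])\s+(\$)/', '$1\$', $js);

echo $out, PHP_EOL;
// var ready = function () {
//     init();
// }$(document).ready(ready);
```

In the input, the JavaScript interpreter can end the var statement at the line break. In the output, }$(document) sits on one line with nothing separating the two statements, which is a syntax error.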

Sidenote

This answer was only meant to explain the four regexes and what they do. The expressions could be improved a lot, but I'm not going into that because regexes are not the right tool for this job. As Qtax points out in the comments, you really should use a proper JS minifier for this task. You might want to check out Google's Closure Compiler - it looks pretty neat.
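As a small illustration of why the regex approach is fragile, this sketch feeds the first regex a made-up line of JavaScript whose string literal happens to contain a semicolon followed by spaces. The input is an assumption for demonstration only.

```php
<?php
// Hypothetical sample input: the string literal contains "; " followed by spaces.
$js = 'var msg = "a;   b";';

// The first regex cannot tell code from string contents,
// so it also strips the spaces inside the literal.
$out = preg_replace('/([;])\s+/', '$1', $js);

echo $out, PHP_EOL; // var msg = "a;b";
```

The value of msg has changed, which a real minifier that tokenizes the JavaScript would avoid.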

If you're still confused about how it works, don't worry. Learning regexes can be difficult in the beginning. I suggest you use this website - http://regularexpressions.info. It is a pretty decent resource for learning regular expressions. If you're looking for a book, you might want to check out Mastering Regular Expressions by Jeffrey Friedl.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow