Minification: Using regex to remove linebreaks from JavaScript code

Question

Dissection of all four regular expressions

Let's try and dissect each one of the regular expressions.

First regex

$buffer = preg_replace('/([;])\s+/', '$1', $buffer);

Explanation

(      # beginning of the first capturing group
 [;]   # match the literal character ';'
)      # ending of the first capturing group
\s+    # one or more whitespace characters (including newlines)

The above regular expression removes any whitespace that occurs immediately following a semicolon. ([;]) is a capturing group, meaning if a match is found, it is stored into a backreference, so we could use it later. For example, if our string was foo; <space><space>, then the expression would match ; and the whitespace characters. The replacement pattern here is $1, which means the entire matched string would be replaced with just a semicolon.
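As a quick check of the behaviour described above, here is a minimal sketch that runs the first regex on a made-up input. The sample string and the variable name $result are assumptions for illustration, not part of the original code.

```php
<?php
// Hypothetical input: two statements, each followed by whitespace and/or a newline.
$buffer = "foo();  \nbar();\n";

// Same pattern as above: capture ';' and drop all whitespace that follows it.
$result = preg_replace('/([;])\s+/', '$1', $buffer);

echo $result, PHP_EOL; // foo();bar();
```

Note that \s+ also consumes the newline, which is what joins the two statements onto one line.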

Second regex

$buffer = preg_replace('/([}])\s+(else)/', '$1else', $buffer);

Explanation

(      # beginning of the first capturing group
 [}]   # match the literal character '}'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(else) # match and capture 'else'

The above regex removes any whitespace between a closing curly brace (}) and else. The replacement pattern here is $1else, which means the matched string, whitespace included, gets replaced by what was captured by the first capturing group ([}]) (which is just the closing curly brace) followed by the keyword else. Nothing much to it.
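A small sketch of the second regex in action, assuming a made-up if/else snippet as input. It also shows that the second capturing group is never referenced by the original replacement, so writing $1$2 instead (an illustrative variation, not the original code) gives the same result.

```php
<?php
// Hypothetical input: 'else' sits on its own line after the closing brace.
$buffer = "if (a) { b(); }\n\nelse { c(); }";

// Original replacement: keep the brace, re-type the keyword literally.
$result = preg_replace('/([}])\s+(else)/', '$1else', $buffer);

// Equivalent variation that reuses the second capturing group.
$resultWithGroup = preg_replace('/([}])\s+(else)/', '$1$2', $buffer);

echo $result, PHP_EOL; // if (a) { b(); }else { c(); }
```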

Third regex

$buffer = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);

Explanation

(      # beginning of the first capturing group
 [}]   # match the literal character '}'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(var)  # match and capture 'var'

This works the same way as the previous regex, with two differences: the keyword is var instead of else, and the replacement pattern, $1;var, inserts a semicolon after the closing brace. The semicolon is usually optional in JavaScript because the interpreter can infer where a statement ends from the line break. But if you want to write multiple statements on a single line, there's no way for the interpreter to know they're separate statements, so a ; is needed to terminate each one.
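To make the inserted semicolon visible, here is a minimal sketch on a made-up input where a function expression is followed by a var declaration on the next line. The sample JavaScript is an assumption chosen for illustration.

```php
<?php
// Hypothetical input: no semicolon after the closing brace, relying on the newline.
$buffer = "var f = function () { return 1; }\nvar x = f();";

// The newline is removed and ';' is inserted so the two statements stay separate.
$result = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);

echo $result, PHP_EOL; // var f = function () { return 1; };var x = f();
```

Without the added semicolon the joined line would read }var x, which is a syntax error after a function expression.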

Fourth regex

$buffer = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);

Explanation

(      # beginning of the first capturing group
 [{};] # match the literal character '{' or '}' or ';'
)      # ending of the first capturing group
\s+    # one or more whitespace characters
(      # beginning of the second capturing group
 \$    # match the literal character '$'
)      # ending of the second capturing group

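The replacement pattern here is $1\$, which means the entire matched string would be replaced with what was matched by the first capturing group ([{};]) followed by a literal $ character. In other words, it removes the whitespace between {, } or ; and a following $, which typically starts a jQuery-style call.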
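The sketch below applies the fourth regex to a made-up jQuery-style snippet. The input (and the names a and b inside it) is hypothetical. A nowdoc is used so PHP does not try to interpret the $ signs in the sample JavaScript.

```php
<?php
// Hypothetical input; the nowdoc (<<<'JS') keeps every '$' literal.
$buffer = <<<'JS'
$(function () {
    $(a).hide();
    $(b).show();
});
JS;

// Whitespace between '{', '}' or ';' and a following '$' is removed.
$result = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);

echo $result, PHP_EOL;
// Prints two lines:
// $(function () {$(a).hide();$(b).show();
// });
```

The last line break survives because the whitespace after the final ; is followed by }, not by $.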

Sidenote

This answer was only meant to explain the four regexes and what they do. The expressions could be improved a lot, but I'm not going into that, as regex-based minification is not the correct approach. As Qtax points out in the comments, you really should use a proper JS minifier to achieve this task. You might want to check out Google's Closure Compiler - it looks pretty neat.
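One illustration of why a regex is the wrong tool here (a made-up example, and only one of several possible failure modes): the first regex cannot tell code from string contents, so it silently changes a string literal that happens to contain a semicolon followed by spaces.

```php
<?php
// Hypothetical input: the semicolon and spaces are inside a JavaScript string.
$buffer = 'var msg = "Done;   thanks";';

// The first regex strips the spaces anyway, altering the string's value.
$result = preg_replace('/([;])\s+/', '$1', $buffer);

echo $result, PHP_EOL; // var msg = "Done;thanks";
```

A real minifier parses the JavaScript first, so it knows which whitespace is significant.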

If you're still confused about how it works, don't worry. Learning regexes can be difficult in the beginning. I suggest you use this website - http://regularexpressions.info. It is a pretty decent resource for learning regular expressions. If you're looking for a book, you might want to check out Mastering Regular Expressions by Jeffrey Friedl.