Question

While testing an answer to another user's question I found something I don't understand. The problem was to replace every literal \t, \n and \r sequence in a string with a single space.

Now, the first pattern I tried was:

/(?:\\[trn])+/

which surprisingly didn't work. I tried the same pattern in Perl and it worked fine. After some trial and error I found that PHP wants 3 or 4 backslashes for that pattern to match, as in:

/(?:\\\\[trn])+/

or

/(?:\\\[trn])+/

To my surprise, both of these patterns work. Why are the extra backslashes necessary?


Solution

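You need 4 backslashes in the PHP source to represent 1 literal backslash in the regex, because escaping is undone twice:

  • the PHP string parser turns each \\ into \, which uses up 2 of the 4 backslashes ("\\\\" -> \\)
  • the regex engine then turns \\ into a single literal \, which uses up 1 more (\\ -> \)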
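
The two unescaping steps can be checked with a small sketch. The subject string and the variable names below are made up for illustration; the only assumption is a PHP build with the standard PCRE functions.

```php
<?php
// Subject containing the literal two-character sequences \t, \n, \r
// (single quotes: PHP leaves \t etc. untouched here).
$subject = 'a\tb\n\rc';

// Four backslashes in the PHP source -> two in the string -> one literal
// backslash as seen by the regex engine.
$pattern = '/(?:\\\\[trn])+/';

echo $pattern, "\n";                               // the pattern PCRE actually receives
echo preg_replace($pattern, ' ', $subject), "\n";  // runs of \t, \n, \r become one space

// Two backslashes in the source -> one in the string, so PCRE sees \[
// (an escaped bracket) and looks for the literal text "[trn]" instead.
$tooFew = '/(?:\\[trn])+/';
echo preg_replace($tooFew, ' ', $subject), "\n";   // subject comes back unchanged
```

Echoing the pattern before using it is a cheap way to see exactly what the regex engine gets after PHP has processed the string literal.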

From the PHP documentation:

escaping any other character will result in the backslash being printed too

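Hence for \\\[:

  • the first two backslashes collapse into one, and the third one stays because \[ is not a recognized escape sequence ("\\\[" -> \\[)
  • the regex engine then turns \\ into a literal backslash (\\[ -> \[, that is, a literal backslash followed by the opening of the character class)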
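
A minimal sketch to confirm that the three-backslash and four-backslash literals end up as the same string (the variable names are illustrative only):

```php
<?php
// Three backslashes: \\ collapses to \, and the unknown escape \[ is kept as is.
$three = "/(?:\\\[trn])+/";
// Four backslashes: each \\ pair collapses to a single \.
$four  = "/(?:\\\\[trn])+/";

var_dump($three === $four);   // both literals produce the same string
echo $three, "\n";            // the pattern handed to the regex engine
```

Since the regex engine receives identical input in both cases, the difference is purely one of readability in the PHP source.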

Yes, it works, but it is not good practice.

OTHER TIPS

It works in Perl because there you pass it directly as a regex pattern: /(?:\\[trn])+/

In PHP, however, you have to pass the pattern as a string, so the backslash itself needs extra escaping:

"/(?:\\\\[trn])+/"

The regex \\, which matches a single backslash, becomes '/\\\\/' as a PHP preg string.

The regular expression is just /(?:\\[trn])+/. But since you need to escape the backslashes in string declarations as well, each backslash must be expressed with \\:

"/(?:\\\\[trn])+/"
'/(?:\\\\[trn])+/'

Just three backslashes also work, because PHP doesn't recognize \[ as an escape sequence and leaves it untouched. So \\ becomes \, but \[ stays \[.

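Use str_replace!

$code = str_replace(array('\t', '\n', '\r'), ' ', $code);

This should do the trick, provided the search strings are single-quoted so that they match the literal sequences. Unlike the regex, it replaces each sequence with its own space instead of collapsing a run into one.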
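
To see why the quoting style of the search strings matters with str_replace, here is a small sketch. The sample value of $code is invented for the demonstration and contains both a literal \t sequence and a real tab character.

```php
<?php
// 'a\tb' holds a backslash followed by t; "\t" is a real tab character.
$code = 'a\tb' . "\t" . 'c';

// Double-quoted search strings are real control characters,
// so only the actual tab is replaced.
$controlChars = str_replace(array("\t", "\n", "\r"), ' ', $code);

// Single-quoted search strings are backslash + letter,
// so only the literal \t sequence is replaced.
$literalSeqs = str_replace(array('\t', '\n', '\r'), ' ', $code);

var_dump($controlChars);
var_dump($literalSeqs);
```

Which variant is appropriate depends on whether the input contains real control characters or their escaped, two-character spellings.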

Licensed under: CC-BY-SA with attribution