Question

I'm trying to escape regex-reserved characters with a backslash (don't ask -- suffice it to say I'm NOT trying to parse HTML :) ), and I'm getting something odd.

$regex_chars = array('[' , '\\' , '^', '$' , '.' , '|' , 
    '?' , '*' , '+' , '(' , ')');  
$regex_chars_escaped = array('\[ ' , '\\\\ ' , '\^ ', '\& ' , 
    '\. ' , '\| ' , '\? ' , '\* ' , '\+ ' , '\( ' , '\)'); 
$escaped_string = str_replace($regex_chars,$regex_chars_escaped,
     implode("",$regex_chars));
echo implode('&nbsp;',$regex_chars) . "<br />";
echo $escaped_string;

The spaces in the replacement strings are there for clarity. This is the output:

[ \ ^ $ . | ? * + ( )
\\ [ \\ \^ \& \. \| \? \* \+ \( \)

So all is good, except for the first part. Where does the "\\" come from and why isn't it "\[" ?

Solution

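Why not simply use preg_quote()? It puts a backslash in front of every character that has special meaning in a regular expression, and it does so in a single pass.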
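As a rough sketch of that suggestion, the snippet below runs the question's characters through preg_quote(). The variable names and the '/' delimiter argument are assumptions made for illustration; the question does not say which delimiter the final pattern uses.

```php
<?php
// The characters from the question, joined into one string.
// In a single-quoted PHP string, '\\' is one literal backslash.
$input = '[\\^$.|?*+()';

// preg_quote() escapes each special character exactly once, so the
// backslashes it inserts are never escaped a second time.
// The second argument (assumed here to be '/') names the pattern
// delimiter, which is escaped as well if it occurs in the input.
$quoted = preg_quote($input, '/');

echo $quoted, PHP_EOL;   // prints \[\\\^\$\.\|\?\*\+\(\)

// The quoted string can be dropped into a pattern and matches the
// original text literally.
$matches = preg_match('/^' . $quoted . '$/', $input);
```

Here $matches ends up as 1, since the quoted pattern matches the raw input character for character.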

OTHER TIPS

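I believe it's just because of the order of the characters in the arrays. str_replace works through the search/replace pairs one after another, so the '[' is first turned into '\[ ', and then the '\\' entry doubles the backslash that was just inserted. Put the backslash first so it is handled before any other replacement adds new backslashes. Try this: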

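// The backslash comes first, so backslashes added by the later
// replacements are not escaped again.
$regex_chars = array('\\' , '[' , '^', '$' , '.' , '|' , 
        '?' , '*' , '+' , '(' , ')');  
// '$' maps to '\$ ' (the question has '\& ', presumably a typo).
$regex_chars_escaped = array( '\\\\ ' ,'\[ ', '\^ ', '\$ ' , 
        '\. ' , '\| ' , '\? ' , '\* ' , '\+ ' , '\( ' , '\)'); 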

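With that order you should get the expected output. See the "Replacement order gotcha" note in the PHP manual entry for str_replace: because it replaces left to right, it can replace a previously inserted value when doing multiple replacements.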
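To see the ordering effect in isolation, here is a minimal sketch using only two characters. The variable names are made up for this example, and strtr() is shown as one possible alternative rather than something the answers above proposed; given an array, it does not re-scan text it has already replaced.

```php
<?php
// Two characters: '[' followed by a single backslash.
$subject = '[\\';

// '[' first: step 1 gives \[\ and step 2 then doubles BOTH backslashes,
// including the one that step 1 just inserted.
$wrongOrder = str_replace(array('[', '\\'), array('\\[', '\\\\'), $subject);

// Backslash first: step 1 gives [\\ and step 2 only adds one backslash
// in front of '['.
$rightOrder = str_replace(array('\\', '['), array('\\\\', '\\['), $subject);

// strtr() with an array replaces in one pass over the input, so the
// order of the pairs does not matter.
$singlePass = strtr($subject, array('[' => '\\[', '\\' => '\\\\'));

echo $wrongOrder, PHP_EOL;   // prints \\[\\
echo $rightOrder, PHP_EOL;   // prints \[\\
echo $singlePass, PHP_EOL;   // prints \[\\
```

The first line of output reproduces the stray double backslash from the question; the other two show the intended escaping.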

Licensed under: CC-BY-SA with attribution