I have HTML comments being wrapped in Li and P tags :(

https://stackoverflow.com/questions/1217197

07-07-2019
Question

I have content that is first passed through htmlentities, then stripslashes, and finally nl2br.

This means a watermark at the end ends up as:

<li><p><!-- watermark --></p></li>

Not very useful. I have the code below to try to strip the wrapper around the HTML comment and stop it displaying, but it's not very good at it!

$methodfinal = str_replace('<li><p><!--', '<!--', $method);
$methodfinal2 = str_replace('--></p></li>', '-->', $methodfinal);
echo $methodfinal2;

Anyone got any ideas?

Solution

EDIT: following Zed's and your comments I've done some testing and this is what you should use:

$final = preg_replace('/<li><p>[\s]*?&lt\;!--(.*?)--&gt\;<\/p><\/li>/m', "<!--$1-->", $z);

Here is a breakdown of the RE:

<li><p>

this is obvious

[\s]*?

because you have a few spaces and a newline between the <li><p> and the comment. We only want to consume as much whitespace as necessary, so we use the non-greedy *? (it should work with * as well)

&lt\;

the backslash before the ; is harmless but not required: a semicolon has no special meaning in a regular expression

!--(.*?)--

again we use *? so that we match only this one comment (otherwise, if the same kind of line appeared again, it would match from the first one to the last one)

&gt\;<\/p><\/li>

same as above

/m'

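the m modifier makes ^ and $ match at the start and end of every line. This pattern uses neither, so the modifier has no effect here and can be dropped; \s already matches newlines without it. Note that . does not match a newline unless the s modifier is used, so a comment spanning several lines would not match.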
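
To see the pattern from this answer in action, here is a minimal sketch. The sample string in $z is an assumption made up for illustration: it imitates an entity-encoded comment that ended up inside <li><p> with a newline and some spaces in front of it.

```php
<?php
// Hypothetical input: the comment was entity-encoded by htmlentities()
// and then wrapped in <li><p>, with a newline and spaces before it.
$z = "<li><p>Step one</p></li>\n<li><p>\n  &lt;!-- watermark --&gt;</p></li>";

// The pattern from the answer, unchanged.
$final = preg_replace('/<li><p>[\s]*?&lt\;!--(.*?)--&gt\;<\/p><\/li>/m', "<!--$1-->", $z);

// The ordinary list item is left alone; the wrapped comment becomes a real HTML comment.
echo $final;
```

Only the list item that contains nothing but the encoded comment is rewritten; the first list item passes through untouched.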

OTHER TIPS

Something like this?

$final = preg_replace("/<li><p>(<!--.*-->)<\/p><\/li>/", "$1", $original);

@Zed:

Let's be more careful:

$final = preg_replace("/<li><p>(<!--.*?-->)<\/p><\/li>/", "$1", $original);
# use .*? instead of .* unless you specifically want the greedy behaviour
# .*? matches as little as it can
# .* matches as much as it can

even better:

$final = preg_replace("/<li><p>(<!--[^\-\>]+-->)<\/p><\/li>/", "$1", $original);
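# [^\-\>]+ matches a run of characters that are neither - nor >,
# so it needs less backtracking and can be faster, but it will not
# match a comment whose text itself contains a - or a >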

Just trying to advocate better regex practice. Hope this helps.
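
The difference between the three patterns is easiest to see on a small test. This is only a sketch: the input strings are invented for illustration, and it assumes the comments are still literal <!-- ... --> at this point rather than entity-encoded.

```php
<?php
// Hypothetical input: two wrapped comments on the same line.
$original = '<li><p><!-- a --></p></li><li><p><!-- b --></p></li>';

// Greedy .* runs on to the LAST "-->", so only the outermost wrapper is removed.
$greedy = preg_replace('/<li><p>(<!--.*-->)<\/p><\/li>/', '$1', $original);

// Lazy .*? stops at the FIRST "-->", so each wrapper is removed separately.
$lazy = preg_replace('/<li><p>(<!--.*?-->)<\/p><\/li>/', '$1', $original);

// The negated class refuses comment text containing "-" or ">",
// so this hyphenated comment is not matched and the string comes back unchanged.
$hyphen = '<li><p><!-- water-mark --></p></li>';
$strict = preg_replace('/<li><p>(<!--[^\-\>]+-->)<\/p><\/li>/', '$1', $hyphen);

echo $greedy, "\n"; // <!-- a --></p></li><li><p><!-- b -->
echo $lazy, "\n";   // <!-- a --><!-- b -->
echo $strict, "\n"; // <li><p><!-- water-mark --></p></li>
```

So the lazy version is the safest general choice, and the negated class is worth using only when the comment text is known to contain no hyphens or > characters.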

Licensed under: CC-BY-SA with attribution
