Regex to trim text between tags

https://stackoverflow.com/questions/19875305

30-07-2022

Question

I expected this to be a simple regex but I guess my head isn't screwed on this morning!

I'm taking the source code of a page and tidying it up with a bunch of other preg_replace calls, so by the time we get to the regex below, the result is already a single-line string with things like comments stripped out.

All I'm looking to do now is trim the text between the > and < characters to remove extra whitespace. I.e.

<p>    hello world   </p>

should become

<p>hello world</p>

I figured this would do the trick, but it seems to do nothing?

$data = trim(preg_replace('/>(\s*)([^\s]*?)(\s*)</', '>$2<', $data));

Cheers.

Solution 2

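You can use the /e modifier in the regex to call the trim() function while replacing. Note that /e was deprecated in PHP 5.5.0 and removed in PHP 7.0.0, so this only works on older versions; preg_replace_callback() is its replacement.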

$data = preg_replace('/>([^<]*)</e', '">" . trim("$1") . "<"', $data);
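
On PHP 7 and later, where /e is no longer available, the same idea can be written with preg_replace_callback(). This is only a minimal sketch, assuming the input is the single-line string described in the question; the function name trimBetweenTags is made up for illustration.

```php
<?php
// Hypothetical helper: trims the text found between each ">" and the next "<".
function trimBetweenTags(string $html): string
{
    $result = preg_replace_callback(
        '/>([^<]*)</',
        function (array $m): string {
            // $m[1] is the raw text between the two tags.
            return '>' . trim($m[1]) . '<';
        },
        $html
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

echo trimBetweenTags('<p>    hello world   </p>'), "\n"; // <p>hello world</p>
```

Whitespace inside the text itself (between "hello" and "world") is left alone, since trim() only touches the two ends.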

Other hints

Here's a ridiculous way to do it lol:

$str = "<p>    hello world   </p>";
$strArr = explode(" ", $str);
$strArr = array_filter($strArr);
var_dump(implode(" ",$strArr));

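Use the power of arrays to collapse the white space lol. Note that this prints "<p> hello world </p>", so a single space is still left next to each tag, and array_filter() will also drop a token that is just "0".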

A regex could be:

>\s+(.*[^\s])\s+<

but don't use it; there are better ways to reach that goal (for example, HTML Tidy).
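
To see one reason that pattern is fragile, here is a small sketch. The two-paragraph input is an invented example, not from the question. Because .* is greedy, it runs across tag boundaries, so only the outermost whitespace gets trimmed.

```php
<?php
// Pattern from the answer above, wrapped in delimiters for preg_replace().
$pattern = '/>\s+(.*[^\s])\s+</';

// Fine on the single-tag example from the question.
$one = preg_replace($pattern, '>$1<', '<p>    hello world   </p>');
echo $one, "\n"; // <p>hello world</p>

// Invented input with two tags: the greedy .* swallows "</p><p>" as well,
// so the whitespace in the middle is left untouched.
$two = preg_replace($pattern, '>$1<', '<p>  a  </p><p>  b  </p>');
echo $two, "\n"; // <p>a  </p><p>  b</p>
```

The pattern also requires at least one whitespace character on both sides, so text that is padded on only one side is not matched at all.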

You may use this snippet of code.

$x = '<p>    hello world   </p>';
$foo = preg_replace('/>\s+/', '>', $x);   // first, remove whitespace after each ">"
$foo = preg_replace('/\s+</', '<', $foo); // then remove whitespace before each "<"
echo htmlentities($foo);                  // escaped only so the tags are visible in a browser
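
The two passes can also be folded into one call, since preg_replace() accepts arrays of patterns and replacements and applies them pairwise, in order. A short sketch using the same sample string, with the htmlentities() step left out so the raw result is printed:

```php
<?php
$x = '<p>    hello world   </p>';

// Pattern 1 strips whitespace after ">", pattern 2 strips whitespace before "<".
$foo = preg_replace(['/>\s+/', '/\s+</'], ['>', '<'], $x);

echo $foo, "\n"; // <p>hello world</p>
```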

License: CC-BY-SA with attribution

Not affiliated with StackOverflow