Assuming that all square brackets are balanced and not nested, you can use this code:
$pattern = '~(?:\[|(?!\A)\G)[^]\r\n]*\K\R+~';
$txt = preg_replace($pattern, '', $txt);
Pattern details:
(?: # open a non capturing group
\[ # a literal opening square bracket
| # or
(?!\A)\G # the position in the string after the last match
) # close the non capturing group
[^]\r\n]* # zero or more characters that are not ] or CR or LF
\K # discards everything matched so far from the match result
\R+ # any type of newline one or more times
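As a quick check of the pattern described above, here is a minimal sketch run on a made-up sample string (the input is an assumption for illustration, not taken from the question). Only the newlines between [ and ] should disappear:

```php
<?php
// Sample input (made up for illustration): one bracketed part spanning three lines.
$sample = "a\nb [x\ny\nz] c\nd";

$pattern = '~(?:\[|(?!\A)\G)[^]\r\n]*\K\R+~';
$result = preg_replace($pattern, '', $sample);

// The newlines outside the brackets are kept, the ones inside are removed.
var_dump($result === "a\nb [xyz] c\nd"); // bool(true)
```

Each replacement removes one run of newlines; the \G branch then lets the next match continue from where the previous one ended, without needing a new opening bracket.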
The pattern above assumes that there is always a closing square bracket. If the closing square bracket is missing, all the text after the opening square bracket is processed until the end of the string.
If you want to change this behavior, you must add a lookahead assertion that checks for the closing square bracket (but note that this makes the pattern slower):
(?:\[|(?!\A)\G)[^]\r\n]*\K\R+(?=[^]]*])
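To make the difference concrete, the following sketch compares the two patterns on a made-up input whose second opening bracket is never closed (the sample string and variable names are assumptions for illustration):

```php
<?php
// Sample input (made up): the first [...] is closed, the second [ is not.
$sample = "[a\nb]\nc [d\ne\nf";

$plain   = '~(?:\[|(?!\A)\G)[^]\r\n]*\K\R+~';
$checked = '~(?:\[|(?!\A)\G)[^]\r\n]*\K\R+(?=[^]]*])~';

$withoutCheck = preg_replace($plain, '', $sample);
$withCheck    = preg_replace($checked, '', $sample);

// Without the lookahead, everything after the unclosed [ is processed.
var_dump($withoutCheck === "[ab]\nc [def");  // bool(true)
// With the lookahead, newlines after the unclosed [ are left alone.
var_dump($withCheck === "[ab]\nc [d\ne\nf"); // bool(true)
```

The lookahead has to scan forward for a ] on every candidate newline, which is where the extra cost comes from.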
About \G:
This is an anchor (as ^, $, \A, \z are) that represents the position in the string after the last match. However, since there is no last match at the start, \G is set to the start of the string (\A or ^). To avoid this case, one way is to add a negative lookahead after \G or a negative lookbehind before it (the two are equivalent here, since you are dealing with zero-width assertions): (?!\A)
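The effect of the (?!\A) guard can be seen with a small sketch on a made-up string that has a newline before the first bracket (sample input and variable names are assumptions for illustration). Without the guard, \G succeeds at the very start of the string, so the pattern starts removing newlines outside the brackets:

```php
<?php
// Sample input (made up): a newline appears before any opening bracket.
$sample = "x\ny [a\nb]";

$unguarded = '~(?:\[|\G)[^]\r\n]*\K\R+~';       // \G alone also matches at offset 0
$guarded   = '~(?:\[|(?!\A)\G)[^]\r\n]*\K\R+~'; // \G is rejected at the string start

$withoutGuard = preg_replace($unguarded, '', $sample);
$withGuard    = preg_replace($guarded, '', $sample);

var_dump($withoutGuard === "xy [ab]");  // bool(true): the first newline is lost too
var_dump($withGuard === "x\ny [ab]");   // bool(true): only the bracketed newline is removed
```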
If you don't care about square brackets and only want to skip content between curly brackets, you can do this:
$pattern = '~(\R?\h*{[^}]*})|\R+~';
$txt = preg_replace($pattern, '$1', $txt);
where the curly-bracket parts (with the leading newline, as in your example) are replaced by themselves, or this:
$pattern = '~\R?\h*{[^}]*}(*SKIP)(*FAIL)|\R+~';
$txt = preg_replace($pattern, '', $txt);
where the same parts are skipped: the subpattern is forced to fail with (*FAIL), and (*SKIP) prevents the regex engine from retrying at any position inside the text already matched before it (the next attempt starts at the position where (*SKIP) was reached).
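As a rough check that the two curly-bracket variants behave the same way, this sketch runs both on a made-up string (the sample input and variable names are assumptions for illustration):

```php
<?php
// Sample input (made up): a {...} block on its own lines, with newlines elsewhere.
$sample = "a\nb\n{c\nd}\ne";

// Variant 1: capture the curly-bracket part and put it back with $1.
$viaCapture = preg_replace('~(\R?\h*{[^}]*})|\R+~', '$1', $sample);

// Variant 2: skip the curly-bracket part with backtracking control verbs.
$viaSkip = preg_replace('~\R?\h*{[^}]*}(*SKIP)(*FAIL)|\R+~', '', $sample);

// Both keep the block (and its leading newline) and remove the other newlines.
var_dump($viaCapture === "ab\n{c\nd}e"); // bool(true)
var_dump($viaSkip === $viaCapture);      // bool(true)
```

The second variant avoids the capture group and the replacement reference, at the price of relying on PCRE-specific verbs.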