Maybe this can do the trick? You don't need lookbehinds.
echo preg_replace('/\s*(<(?:\/ul>|li[\s>]))/i', '$1', $your_document);
Here $your_document is the HTML code you want to process.
So, if this is your HTML:
<section>
<hgroup>
<h1 style="text-align: center;">Koptitel 1</h1>
<h2 style="text-align: center;">Subtitel</h2>
</hgroup>
<ul class="sample1">
<li class="sample2">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum placerat, urna eget ultricies egestas, lectus mi tincidunt nulla, ut molestie odio lectus ut arcu.</li>
<li class="sample2">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum placerat, urna eget ultricies egestas, lectus mi tincidunt nulla, ut molestie odio lectus ut arcu.</li>
</ul>
</section>
The output for that looks like this:
<section>
<hgroup>
<h1 style="text-align: center;">Koptitel 1</h1>
<h2 style="text-align: center;">Subtitel</h2>
</hgroup>
<ul class="sample1"><li class="sample2">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum placerat, urna eget ultricies egestas, lectus mi tincidunt nulla, ut molestie odio lectus ut arcu.</li><li class="sample2">Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum placerat, urna eget ultricies egestas, lectus mi tincidunt nulla, ut molestie odio lectus ut arcu.</li></ul>
</section>
This removes all whitespace, including new-line (\n) characters, between <ul> and <li>, between </li> and <li>, and between </li> and </ul>, so the entire <ul> element ends up on one line with no whitespace between those tags. More precisely, the pattern strips any whitespace that directly precedes an opening <li> tag or a closing </ul> tag. The i flag makes the regular expression case-insensitive, so it matches <LI> as well as <li>.
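If it helps, here is a minimal sketch that wraps the same idea in a function and runs it on a small mixed-case list. The function name collapse_list_whitespace and the sample markup are my own inventions for illustration, not anything from your code. The pattern is written in single quotes so PHP hands the backslashes to the regex engine untouched, and \s alone is used because it already covers \n.

```php
<?php
// Hypothetical helper name; it only wraps the preg_replace call from this answer.
function collapse_list_whitespace(string $html): string
{
    // Strip any whitespace that directly precedes an opening <li> or a closing </ul>.
    // The "i" flag makes <LI> and </UL> match too.
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('/\s*(<(?:\/ul>|li[\s>]))/i', '$1', $html) ?? $html;
}

// Made-up sample: upper-case and lower-case tags, plus markup after the list.
$html = "<UL>\n  <LI>one</LI>\n  <li class=\"x\">two</li>\n</UL>\n<p>kept</p>";

// Prints <UL><LI>one</LI><li class="x">two</li></UL>, then a newline, then <p>kept</p>
echo collapse_list_whitespace($html);
```

Note that the newline before <p> survives, because the pattern only touches whitespace in front of <li and </ul>.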