Question

HTML:

<!--a lot of HTML before-->
<div class="quoteheader">
  <div class="topslice_quote"><a href="htsomelink">Some text</a></div>
</div>
<blockquote class="bbc_standard_quote">Some text<br />
</blockquote>
<div class="quotefooter">
  <div class="botslice_quote"></div>
</div>
<br />
<!--a lot of HTML after-->

I NEED TO: remove everything from div.quoteheader through the first <br /> that follows the quotefooter div, so the result should look like:

<!--a lot of HTML before-->
<!--a lot of HTML after-->

I TRIED:

$message = preg_replace('/<div\sclass=\"quoteheader\">[^<]+<\/div>/i', '', $string)

Solution

I would recommend creating a DOMDocument object from the HTML and then using removeChild() to drop the unwanted nodes.
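
A minimal sketch of that approach, assuming the quote markup always appears as sibling nodes in the order shown in the question. The function name removeQuoteBlocks, the wrapper id quote-strip-wrap, and the sample <p> elements are made up for illustration. Because the <br /> inside the blockquote is a child of the blockquote rather than a sibling of the header, walking siblings stops at the intended <br />.

```php
<?php
// Sample input modeled on the question; the <p> elements are placeholders.
$html = <<<HTML
<p>before</p>
<div class="quoteheader">
  <div class="topslice_quote"><a href="htsomelink">Some text</a></div>
</div>
<blockquote class="bbc_standard_quote">Some text<br />
</blockquote>
<div class="quotefooter">
  <div class="botslice_quote"></div>
</div>
<br />
<p>after</p>
HTML;

function removeQuoteBlocks(string $html): string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely perfect; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    // Wrap the fragment so that every top-level node ends up under one known element.
    $doc->loadHTML('<div id="quote-strip-wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $wrap = $xpath->query('//div[@id="quote-strip-wrap"]')->item(0);

    // Copy the matches into an array before mutating the tree.
    $headers = iterator_to_array($xpath->query('.//div[@class="quoteheader"]', $wrap));
    foreach ($headers as $header) {
        // Collect the header and its following siblings up to the first <br> sibling.
        $doomed = [];
        $foundBr = false;
        for ($node = $header; $node !== null; $node = $node->nextSibling) {
            $doomed[] = $node;
            if ($node->nodeName === 'br') {
                $foundBr = true;
                break;
            }
        }
        // Leave the markup alone if no closing <br> sibling was found.
        if (!$foundBr) {
            continue;
        }
        foreach ($doomed as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialize only the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$clean = removeQuoteBlocks($html);
echo $clean;
```

The parser normalizes the markup on output (for example, whitespace and entity encoding may differ from the input), so this fits best when the result is rendered rather than compared byte for byte with the original.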

Other tips

You would be much better served by an XML/HTML/DOM parser than by regex. SimpleXML is pretty simple.

You would just load the HTML with SimpleXML or some other HTML/XML parser, use XPath to find the nodes and/or comments you're looking for, then remove them.
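
A rough sketch of the SimpleXML route, under the assumption that the markup is well-formed XML (XHTML-style, as in the question) with a single root element; the <div id="post"> root and the <p> elements are placeholders added for illustration. SimpleXML has no method for removing a node, so the sketch converts each match with dom_import_simplexml() and detaches it through the DOM.

```php
<?php
// Assumption: the input parses as XML. Sloppy HTML would need DOMDocument::loadHTML() instead.
$xml = <<<XML
<div id="post">
<p>before</p>
<div class="quoteheader">
  <div class="topslice_quote"><a href="htsomelink">Some text</a></div>
</div>
<blockquote class="bbc_standard_quote">Some text<br />
</blockquote>
<div class="quotefooter">
  <div class="botslice_quote"></div>
</div>
<br />
<p>after</p>
</div>
XML;

$root = simplexml_load_string($xml);

// Select the three quote containers plus the first <br/> sibling after each footer.
$targets = $root->xpath(
    '//div[@class="quoteheader" or @class="quotefooter"]'
    . ' | //blockquote[@class="bbc_standard_quote"]'
    . ' | //div[@class="quotefooter"]/following-sibling::br[1]'
);

foreach ($targets as $node) {
    // Drop down to the underlying DOM node and detach it from its parent.
    $dom = dom_import_simplexml($node);
    $dom->parentNode->removeChild($dom);
}

$result = $root->asXML();
echo $result;
```

Selecting the <br /> as a following sibling of the footer avoids matching the <br /> inside the blockquote, which is the main trap for the regex attempts.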

An alternative... if you can delimit the code with comments, like this:

<!--code-->
<div> .. </div>
<!--/code-->

you can remove everything between them, delimiters included:

$newstr = preg_replace('/<!--code-->.*?<!--\/code-->/is', '', $htmlstring);
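
Or, with a single regex. Note that (.+) is greedy, so with the s modifier it removes everything up to the last <br /> in the string, not the first one after the quote:

$message = preg_replace('/(<div class="quoteheader">)(.+)(<br \/>)/si', '', $string);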
Licensed under: CC-BY-SA with attribution