Question

Using PHP, how can I remove HTML text that is placed before/after a certain number of <br> tags?

For example, I have this,

<div>
    <div><img sec=""></div>
    <br>
    <h3>title</h3>
    <span>some text here</span>
    <br>
    Some text that I want to remove.
    <br>
    <br>
</div>

I'd like to remove the text that comes before the last two <br> tags; put another way, the text after the second <br>.

I tried splitting the string on <br> with explode() and dropping the last two array elements with array_pop(). However, I then had to append </div> myself to close the outer tag, which is not a good idea when the outer tag can change dynamically.

Does anybody have a solution for this?

Solution 3

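Okay, this is what I came up with. It may not be the most efficient way, but I'll share it. I used a DOMinnerHTML() helper (defined at the bottom of the code below) together with preg_split(). It removes everything from the third-to-last <br> tag onward; note that the remaining <br> tags are replaced by spaces when the pieces are joined back together.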

<?php 
$html = <<<STR
<div>
    <div><img sec=""></div>
    <br>
    <h3>title</h3>
    <span>some text here</span>
    <br>
    Some text that I want to remove.
    <br>
    <br>
</div>
STR;

$doc = new DOMDocument;
$doc->loadHTML($html);
$node = $doc->getElementsByTagName('div')->item(0);
$innerHtml = DOMinnerHTML($node);
$arrHtml = preg_split('/<br.*?\/?>/i', $innerHtml);     // divide the string into an array at each <br> or <br />
array_splice($arrHtml, -3);     // remove the last three elements
$edited = implode(" ", $arrHtml);

echo $edited;

function DOMinnerHTML($element) 
{ 
    $innerHTML = ""; 
    $children = $element->childNodes; 
    foreach ($children as $child) 
    { 
        $tmp_dom = new DOMDocument(); 
        $tmp_dom->appendChild($tmp_dom->importNode($child, true)); 
        $innerHTML.=trim($tmp_dom->saveHTML()); 
    } 
    return $innerHTML; 
} 
?> 
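
A possible alternative that stays inside the DOM and never rebuilds the outer tag by hand is sketched below. It assumes the <br> tags are direct children of the outer element, and removeAfterNthBr() is a hypothetical helper name, not part of PHP. Unlike the code above, it keeps the <br> tags and only drops the other nodes that follow the n-th one.

```php
<?php
// Hypothetical helper: removes every non-<br> child of $container that
// comes after the $n-th <br> found among its direct children.
function removeAfterNthBr(DOMElement $container, int $n): void
{
    $seen = 0;
    $toRemove = [];
    foreach ($container->childNodes as $child) {
        $isBr = $child instanceof DOMElement
            && strtolower($child->nodeName) === 'br';
        if ($isBr) {
            $seen++;
        } elseif ($seen >= $n) {
            // Collect first: removing nodes while iterating a live
            // DOMNodeList can skip entries.
            $toRemove[] = $child;
        }
    }
    foreach ($toRemove as $node) {
        $container->removeChild($node);
    }
}

$html = <<<STR
<div>
    <div><img sec=""></div>
    <br>
    <h3>title</h3>
    <span>some text here</span>
    <br>
    Some text that I want to remove.
    <br>
    <br>
</div>
STR;

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$outer = $doc->getElementsByTagName('div')->item(0);

removeAfterNthBr($outer, 2);        // drop what follows the second <br>

$result = $doc->saveHTML($outer);   // serialize only the outer element
echo $result;
```

Because the outer element is serialized by the DOM itself, its closing tag is produced automatically whatever the tag happens to be. Whitespace-only text nodes after the second <br> are removed as well, so the trailing <br> tags end up on one line.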

OTHER TIPS

In addition to Joshua's answer, if you want an easier way, you can use the Simple HTML DOM library, available at the link below. Go through its documentation. The library comes in handy for problems like this one, and also when you want to scrape web content.

http://simplehtmldom.sourceforge.net/

What you'll want to do is string matching with regular expressions, to get the text that sits before the last two <br> tags and after the preceding <br> tag. See the following:

http://www.regular-expressions.info/php.html
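
As a rough sketch of that idea (the pattern is my own assumption about what is wanted, and regular expressions on HTML are fragile in general), the following removes whatever lies between one <br> and the next two <br> tags that are separated only by whitespace:

```php
<?php
$html = <<<STR
<div>
    <div><img sec=""></div>
    <br>
    <h3>title</h3>
    <span>some text here</span>
    <br>
    Some text that I want to remove.
    <br>
    <br>
</div>
STR;

// One <br>, <br/> or <br /> tag.
$br = '<br\s*/?>';

// Group 1: a <br>.
// Middle:  a run of characters that does not contain another <br>.
// Group 2: two <br> tags with only whitespace between them.
$pattern = "~($br)(?:(?!$br).)*?($br\s*$br)~is";

// Keep the three <br> tags and drop the run between them.
$result = preg_replace($pattern, '$1$2', $html);

echo $result;
```

The outer tag is never touched, so nothing has to be closed by hand. The pattern does not cover <br> tags that carry attributes; that would need a looser tag expression.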

I did the following:

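// Keep everything in $str up to, but not including, occurrence number
// ($limit + 1) of $tag. The match is exact, so '<br>' will not match '<br />'.
function limitTag($str, $tag, $limit) {
  $array = explode($tag, $str);
  $newStr = '';
  $i = 0;
  foreach ($array as $child) {
    if ($i <= $limit) {
      if ($i > 0) $newStr .= $tag;  // re-insert the tag between kept pieces
      $newStr .= $child;
      $i++;
    } else break;
  }
  return $newStr;
}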
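
For illustration, here is a more compact sketch of the same idea, assuming the intent of limitTag() is to keep the first $limit + 1 pieces and put the tag back between them. limitTagCompact() is a hypothetical name used only for this example.

```php
<?php
// Hypothetical compact variant: split on $tag, keep the first
// ($limit + 1) pieces, and join them with $tag again.
function limitTagCompact(string $str, string $tag, int $limit): string
{
    $parts = explode($tag, $str);
    return implode($tag, array_slice($parts, 0, $limit + 1));
}

echo limitTagCompact('a<br>b<br>c<br>d', '<br>', 2); // prints a<br>b<br>c
```

Like explode() itself, this matches the tag string exactly, so variants such as <br /> or <BR> are not treated as separators. The result is also cut off in the middle of the markup, so any outer closing tag still has to be handled separately.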
Licensed under: CC-BY-SA with attribution