Question

I'm trying to use simple_html_dom to remove all the spans from a snippet of HTML, and I'm using the following:

$body = "<span class='outer' style='background:red'>x<span class='mid' style='background:purple'>y<span class='inner' style='background:orange'>z</span></span></span>";
$HTML = new simple_html_dom;
$HTML->load($body);   
$spans = $HTML->find('span');
foreach($spans as $span_tag) {
    echo "working on ". $span_tag->class . " ... ";
    echo "setting " . $span_tag->outertext . " equal to " . $span_tag->innertext . "<br/>\n";
    $span_tag->outertext = (string)$span_tag->innertext;
}
$text =  $HTML->save();
$HTML->clear();
unset($HTML);
echo "<br/>The Cleaned TEXT is: $text<br/>";

And here's the result in my browser:

http://www.pixeloution.com/RAC/clean.gif

So why is it I'm only ending up with the outermost span removed?

Edit

Actually, if there's an easier way to do this, I'm game. The objective is to remove the tags but keep anything inside them, including other tags; otherwise I'd just use $obj->plaintext.

Edit #2

Okay ... apparently I got it working, although oddly enough I'd still like to actually understand the problem if anyone ran into this before. Knowing it was only removing the outermost span, I did this:

function cleanSpansRecursive(&$body) {

    $HTML = new simple_html_dom;
    $HTML->load($body); 
    $spans = $HTML->find('span');
    foreach($spans as $span_tag) {
        $span_tag->outertext = (string)$span_tag->innertext;
    }

    $body =  (string)$HTML;
    if($HTML->find('span')) {
        $HTML->clear();
        unset($HTML);
        cleanSpansRecursive($body);
    } else {
        $HTML->clear();
        unset($HTML);
    }  
}

And it seems to work.

Solution

I don't have simple_html_dom installed on my machine or dev server, so I can't test this. From the looks of it, though, setting $span_tag->outertext on the outer span replaces its contents with new markup, so the inner span objects you are still holding from find() no longer correspond to anything in $HTML. Going from the innermost span outward should fix it, since each reference would still be intact when you reach it.
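To make the innermost-first idea concrete, here is a minimal, untested sketch. It assumes that find() returns matches in document order (outer spans before the spans nested in them), so reversing the array visits each nested span before its ancestors. The include path and the helper name removeSpansInnermostFirst are placeholders of mine, not part of the library.

```php
<?php
// Assumption: simple_html_dom.php is reachable on the include path.
require_once 'simple_html_dom.php';

// Hypothetical helper name; not part of simple_html_dom.
function removeSpansInnermostFirst($body)
{
    $html = new simple_html_dom;
    $html->load($body);

    // Assumed: find() yields outer spans before inner ones, so the
    // reversed array handles every nested span before its parent.
    foreach (array_reverse($html->find('span')) as $span) {
        // By the time a parent is reached, its innertext should already
        // reflect the unwrapped children.
        $span->outertext = (string) $span->innertext;
    }

    $text = $html->save();
    $html->clear();
    unset($html);

    return $text;
}

$body = "<span class='outer' style='background:red'>x<span class='mid' style='background:purple'>y<span class='inner' style='background:orange'>z</span></span></span>";
echo removeSpansInnermostFirst($body);
```

If that assumption about ordering holds, a single pass is enough and the recursive re-parse from the second edit is not needed.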

EDIT: In your second edit, each recursive call re-parses the saved string, so whatever spans are left get found as fresh objects on every pass, which is why it works.
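On the "easier way" asked about in the first edit: if the goal is only to drop the span tags themselves and keep everything between them, a regular expression may be enough and avoids the parser entirely. This is a minimal sketch, assuming no attribute value contains a literal > character. stripSpanTags is a hypothetical helper name.

```php
<?php
// Hypothetical helper: removes opening and closing <span> tags and
// leaves their contents (text and other tags) untouched.
function stripSpanTags(string $html): string
{
    // </?span  matches "<span" or "</span"
    // \b       keeps tags such as <spanner> from matching
    // [^>]*    skips any attributes up to the closing ">"
    $result = preg_replace('#</?span\b[^>]*>#i', '', $html);

    // preg_replace() returns null on failure; fall back to the input.
    return $result ?? $html;
}

$body = "<span class='outer' style='background:red'>x<span class='mid' style='background:purple'>y<span class='inner' style='background:orange'>z</span></span></span>";
echo stripSpanTags($body); // prints "xyz"
```

Because the tags are removed as plain text, nesting depth does not matter and no second pass is required. For markup you don't control, a real parser is still the safer choice.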

Licensed under: CC-BY-SA with attribution