Question

I'd like to find a way in PHP to remove some tags from a string. I have this string:

hello
<div class="test-1 safe">Hi everybody</div>
<div>Hello world</div>
<p>Hi guys, this is a text</p>
<div class="test">this is another text</div>

I'm trying to write a method that removes all div tags from a string, except those that have the "safe" class, and that also strips the safe class from the divs it keeps. For example, in this case I'd like the output to be:

hello
<div class="test-1">Hi everybody</div>
Hello world
<p>Hi guys, this is a text</p>
this is another text

I started with a regular expression:

public static function clean_text($text, $parent = '')
{
    // Strips every opening and closing div tag, with no check on the class
    $cleanText = preg_replace("/<\/?div[^>]*>/i", "", $text);
    return $cleanText;
}

But it removed all divs. Then I moved to DOMDocument, but I still have issues (extra HTML such as a doctype gets inserted, and there are encoding problems).

public static function clean_text($text, $parent = '')
{
    //some unnecessary code before...
    $cleanText = $text;

    //parsing DOM
    $dom = new \DOMDocument();
    $dom->loadHTML($cleanText);

    $divs = $dom->getElementsByTagName('div');
    $i = $divs->length - 1;
    while ($i > -1) {
        $div = $divs->item($i);
        if ($div->hasAttribute('class') && strstr($div->getAttribute('class'), 'safe'))
        {
            $class = $div->getAttribute('class');
            $class = str_replace('safe','',$class);
            $div->removeAttribute('class');
            $div->setAttribute('class',$class);
        }
        else
        {
            $txt = $div->nodeValue;
            $newelement = $dom->createTextNode($txt);
            $div->parentNode->replaceChild($newelement, $div);
        }
        $i--;
    }

    $text = $dom->saveHTML();

    return $text;
}

Is there an easier way?

Thanks a lot for your help.


Answer

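You could do that with a negative lookahead. Note that the pattern below only treats a div as safe when safe is the last class in the attribute and is preceded by at least one other class, as in your example, and it assumes the divs are not nested: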

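$pattern = array(
    // unwrap divs whose opening tag has no class attribute ending in " safe"
    '~<div(?![^>]*class="[^"]+ safe")[^>]*>(.*?)</div>~s',

    // then remove the trailing " safe" from the divs that were kept
    '~(<div[^>]+class="[^"]+) safe"~s'
);

// \1 is the div's inner content (first pattern) or the tag up to " safe" (second pattern)
$replace = array('\1', '\1"');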

$str = preg_replace($pattern, $replace, $str);
echo "<pre>".htmlspecialchars($str)."</pre>";

Output:

hello
<div class="test-1">Hi everybody</div>
Hello world
<p>Hi guys, this is a text</p>
this is another text
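
If the regex constraints are too tight (safe not in last position, nested divs), a DOM-based variant may be worth a look. The following is only a sketch under some assumptions: the function name strip_unsafe_divs is hypothetical, the input is assumed to be a UTF-8 HTML fragment, and the fragment is wrapped in an explicit html/body shell with a charset meta tag so that the parser neither guesses the encoding nor wraps the leading text. Only the children of body are serialized, which is how the doctype and html tags mentioned in the question are kept out of the result.

```php
<?php
// Hypothetical helper, not part of the original question or answer.
function strip_unsafe_divs(string $html): string
{
    $dom = new DOMDocument();

    // Silence warnings about unknown or malformed tags while parsing.
    $previous = libxml_use_internal_errors(true);
    // Assumption: $html is UTF-8. The meta tag tells the parser so.
    $dom->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
        . '<body>' . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);

    // Copy the live node list into an array so the DOM can be modified while looping.
    foreach (iterator_to_array($body->getElementsByTagName('div')) as $div) {
        $classes = preg_split('/\s+/', $div->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);

        if (in_array('safe', $classes, true)) {
            // Keep the div, drop only the "safe" class.
            $remaining = array_diff($classes, array('safe'));
            if ($remaining) {
                $div->setAttribute('class', implode(' ', $remaining));
            } else {
                $div->removeAttribute('class');
            }
            continue;
        }

        // Unwrap: move the children in front of the div, then remove the empty div.
        while ($div->firstChild !== null) {
            $div->parentNode->insertBefore($div->firstChild, $div);
        }
        $div->parentNode->removeChild($div);
    }

    // Serialize only the children of body, so no doctype/html/body tags are emitted.
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Called with the string from the question, this should keep the test-1 div without its safe class and unwrap the other two divs. Unlike the nodeValue approach in the question, moving the child nodes preserves any markup nested inside an unwrapped div. Exact whitespace in the result depends on the libxml serializer, so compare loosely if you test it.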
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow