Question

I want to get all occurrences of <td> elements in a string. At the moment I'm using:

$tds = preg_split( '#(?=<td>)#', $toDisplayNotes );

but this does not get all the tds. Is it possible to produce an array that looks like this:

array {
  [0] => "<td>hello</td>"
  [1] => "<td align="right">world</td>"
  [2] => "<td>another td</td>"
}

Solution

Using the DOMDocument class, you can easily get all cells like so:

$dom = new DOMDocument;
$dom->loadHTML($htmlString);
$cells = $dom->getElementsByTagName('td');
$contents = array();
foreach($cells as $cell)
{
    $contents[] = $cell->nodeValue;
}
var_dump($contents);

The $cells variable is a DOMNodeList, so it has a few members you might be able to use (such as length and item()). On each iteration, $cell is assigned a DOMElement (a subclass of DOMNode), which has all sorts of methods and properties that could be useful for your use case, too (like getAttribute).
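As a rough illustration of working with the individual cells, the sketch below reads the align attribute of each cell. The markup in $htmlString is a made-up sample based on the question's expected output, not the asker's real input:

```php
<?php
// Hypothetical sample markup; the question's real input is not shown.
$htmlString = '<table><tr><td>hello</td><td align="right">world</td></tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($htmlString);

$alignments = array();
foreach ($dom->getElementsByTagName('td') as $cell) {
    // hasAttribute() and getAttribute() are DOMElement methods.
    // Cells without an align attribute are recorded as null.
    $alignments[] = $cell->hasAttribute('align')
        ? $cell->getAttribute('align')
        : null;
}
var_dump($alignments);
```

Checking hasAttribute() first is a design choice: getAttribute() returns an empty string for a missing attribute, which would be indistinguishable from align="".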
Looking at your question, though, you'll want the outer HTML (including the tags) in your array. That's easy:

$markup = array();
foreach($cells as $cell)
{
    $markup[] = $dom->saveXML($cell);
}

Side-note:
A for loop might perform better than foreach. I haven't tested or compared the two, but you could check whether you see a difference between the approach above and this one:

$markup = array();
for ($i = 0, $j = $cells->length; $i < $j; $i++)
{
    $markup[] = $dom->saveXML($cells->item($i));
}
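If you want to compare the two loops yourself, a minimal timing sketch could look like the one below. The 1000 generated cells and the use of microtime() are assumptions made for illustration; the numbers will depend on your PHP version and on the size of the document, so treat them as a rough indication only:

```php
<?php
// Hypothetical test data: 1000 generated cells.
$htmlString = '<table><tr>' . str_repeat('<td>cell</td>', 1000) . '</tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($htmlString);
$cells = $dom->getElementsByTagName('td');

// Time the foreach variant.
$start = microtime(true);
$viaForeach = array();
foreach ($cells as $cell) {
    $viaForeach[] = $dom->saveXML($cell);
}
$foreachSeconds = microtime(true) - $start;

// Time the for variant, which looks up each node with item($i).
$start = microtime(true);
$viaFor = array();
for ($i = 0, $j = $cells->length; $i < $j; $i++) {
    $viaFor[] = $dom->saveXML($cells->item($i));
}
$forSeconds = microtime(true) - $start;

printf("foreach: %.6f s, for: %.6f s\n", $foreachSeconds, $forSeconds);
```

Both loops should build the same array, so whichever turns out faster on your setup can be used without changing the result.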

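The reason I'm using saveXML and not saveHTML is simple: called without arguments, saveHTML serializes the whole document, including the <html> and <body> tags that loadHTML wraps around your fragment. That's not what you want. Passing a node to saveXML returns only that node's markup, which makes it the better choice here. (Since PHP 5.3.6, saveHTML also accepts a node argument, so saveHTML($cell) is an alternative on newer versions.)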
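To make the difference concrete, here is a small sketch that serializes the same cells both ways and also dumps the whole document. The sample markup is assumed from the question's expected output, and the saveHTML($cell) call needs PHP 5.3.6 or later:

```php
<?php
// Hypothetical sample markup based on the question's expected array.
$htmlString = '<table><tr><td>hello</td><td align="right">world</td></tr></table>';

$dom = new DOMDocument;
$dom->loadHTML($htmlString);

$asXml  = array();
$asHtml = array();
foreach ($dom->getElementsByTagName('td') as $cell) {
    $asXml[]  = $dom->saveXML($cell);   // markup of this node only
    $asHtml[] = $dom->saveHTML($cell);  // node argument: PHP >= 5.3.6
}

// Without an argument, saveHTML() returns the entire document,
// including the wrapper elements added by loadHTML().
$wholeDocument = $dom->saveHTML();

var_dump($asXml, $asHtml, $wholeDocument);
```

One thing to watch for: saveXML follows XML serialization rules, so an empty cell will typically come out self-closing (<td/>) rather than as <td></td>. If that matters for your output, the node form of saveHTML may be the safer option.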
A slightly related question of mine here

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow