Question

I have this sample code that extracts the value of each tag. In addition to the value, I want to get the class name of each tag.

<?php
$doc = new DOMDocument;
$doc->loadxml( <<< eox
<tr class="calendar_row" data-eventid="42023">
    <td class="date"/>
    <td class="time">All Day</td>
    <td class="currency">CAD</td>
    <td class="impact">
        <span title="Non-Economic" class="holiday"/>
    </td>
    <td class="event">
        <span>Bank Holiday</span>
    </td>
    <td class="detail">
        <a class="calendar_detail level1" data-level="1" title="Open Detail"/>
    </td>
    <td class="actual"/>
    <td class="forecast"/>
    <td class="previous"/>
    <td class="graph"/>
</tr>
eox
);
$xpath = new DOMXPath($doc);

foreach( $xpath->query('//tr[@data-eventid="42023"]/td[@class]') as $n ) {
    echo $n->nodeName.'-'.$n->nodeValue."<br />";
}
?>

Using the snippet above, all I want is to get those values even if some tags aren't well formed (I'm scraping a web source). How can I do this with a DOMDocument XPath query? I'm having trouble because the values being fetched are:

td-
td-All Day
td-CAD
td- 
td-Bank Holiday 
td- 
td-
td-
td-
td-

instead of:

date-
time-All Day
currency-CAD
impact- 
event-Bank Holiday 
detail- 
actual-
forecast-
previous-
graph-

Solution

Instead of $n->nodeName, which returns the element's tag name (td for every cell), use $n->getAttribute('class'), which returns the value of that element's class attribute.

Demo: http://codepad.viper-7.com/ktpnv2
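For reference, here is a minimal sketch of the loop with that change applied, using a shortened version of the row from the question. The helper name classLabels() is hypothetical, and the trim() call is an addition that strips the whitespace around nested values such as "Bank Holiday"; drop it if you need the raw text.

```php
<?php
// Sketch: label each <td> by its class attribute instead of its tag name.
// classLabels() is a hypothetical helper, not part of the original code.
function classLabels(string $xml, string $eventId): array
{
    $doc = new DOMDocument;
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);

    $rows = [];
    // $eventId is concatenated into the expression, so only pass trusted ids.
    foreach ($xpath->query('//tr[@data-eventid="' . $eventId . '"]/td[@class]') as $n) {
        // getAttribute('class') yields "date", "time", ... where nodeName yields "td".
        $rows[] = $n->getAttribute('class') . '-' . trim($n->nodeValue);
    }
    return $rows;
}

$xml = <<<EOX
<tr class="calendar_row" data-eventid="42023">
    <td class="date"/>
    <td class="time">All Day</td>
    <td class="currency">CAD</td>
    <td class="event">
        <span>Bank Holiday</span>
    </td>
</tr>
EOX;

echo implode("<br />", classLabels($xml, '42023'));
```

With this input the loop produces date-, time-All Day, currency-CAD and event-Bank Holiday, one per line.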

OTHER TIPS

echo $n->getAttribute("class") . '-' . $n->nodeValue . "<br />";
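The question also mentions markup that is not well formed. loadXML() rejects such input, so one option (an assumption about your source, not something the answers above cover) is to parse it with the more lenient DOMDocument::loadHTML() and silence the parser warnings with libxml_use_internal_errors(). The helper name classLabelsFromHtml() and the sample markup below are hypothetical; how the parser repairs a given page depends on libxml, so check the result against your real data.

```php
<?php
// Sketch under assumptions: the scraped source is HTML with unclosed tags,
// so the lenient HTML parser is used instead of loadXML().
function classLabelsFromHtml(string $html, string $eventId): array
{
    // Collect parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    // $eventId is concatenated into the expression, so only pass trusted ids.
    foreach ($xpath->query('//tr[@data-eventid="' . $eventId . '"]/td[@class]') as $n) {
        $rows[] = $n->getAttribute('class') . '-' . trim($n->nodeValue);
    }
    return $rows;
}

// Hypothetical sample: the <td> elements are never closed.
$html = '<table><tr class="calendar_row" data-eventid="42023">'
      . '<td class="time">All Day<td class="currency">CAD'
      . '</tr></table>';

echo implode("<br />", classLabelsFromHtml($html, '42023'));
```

The XPath query and the getAttribute('class') call stay the same; only the loading step changes.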
Licensed under: CC-BY-SA with attribution