Question

I'm having trouble with DOMDocument/XPath. The HTML (I have no control over it) looks like this:

.. random html ..

<div class="separator"></div>
<div class="date">01-01-1900</div>

<div class="item"><div>1 HTML garbage</div></div>
<div class="item"><div>2 HTML garbage</div></div>


<div class="separator"></div>
<div class="date">12-12-2012</div>

<div class="item"><div>3 HTML garbage</div></div>
<div class="item"><div>4 HTML garbage</div></div>
<div class="item"><div>5 HTML garbage</div></div>
<div class="item"><div>6 HTML garbage</div></div>

.. more random html ...

How I need my data:

$result = array(
    '01-01-1900' => array(
        array('name' => '1 HTML garbage'),
        array('name' => '2 HTML garbage')
    ),
    '12-12-2012' => array(
        array('name' => '3 HTML garbage'),
        array('name' => '4 HTML garbage'),
        array('name' => '5 HTML garbage'),
        array('name' => '6 HTML garbage')
    )
);

Since the depth can change, I can't use a fixed path copied from my browser console. How can I group the items by date? Right now I can get a list of the items by using:

$xpath->query('//*[contains(concat(" ", normalize-space(@class), " "), " item ")]');

Solution

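Since you are using PHP, you can first get all the dates and then iterate over them, querying the items that belong to each date with something like this (untested):

//*[contains(@class,'item') and preceding-sibling::*[contains(@class,'date')][1][contains(text(),'12-12-2012')]]

with 12-12-2012 as the searched value. The [1] limits the test to the nearest preceding date sibling, so items listed under a later date are not matched as well.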
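
A minimal sketch of the approach described above: collect the dates first, then run one query per date for the items whose nearest preceding date sibling holds that date. The sample markup is copied from the question and wrapped in a single parent div, which is an assumption about the real page. The variable names ($dateClass, $itemClass, $result) are illustrative, and the date text is assumed not to contain a double quote, since it is interpolated into the expression.

```php
<?php
// Sample markup from the question, wrapped in one parent <div> (an assumption).
$html = <<<HTML
<div>
<div class="separator"></div>
<div class="date">01-01-1900</div>
<div class="item"><div>1 HTML garbage</div></div>
<div class="item"><div>2 HTML garbage</div></div>
<div class="separator"></div>
<div class="date">12-12-2012</div>
<div class="item"><div>3 HTML garbage</div></div>
<div class="item"><div>4 HTML garbage</div></div>
<div class="item"><div>5 HTML garbage</div></div>
<div class="item"><div>6 HTML garbage</div></div>
</div>
HTML;

libxml_use_internal_errors(true); // keep parser warnings about sloppy HTML quiet
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Whole-word class tests, same idiom as the query in the question.
$dateClass = 'contains(concat(" ", normalize-space(@class), " "), " date ")';
$itemClass = 'contains(concat(" ", normalize-space(@class), " "), " item ")';

$result = array();
foreach ($xpath->query("//*[$dateClass]") as $dateNode) {
    $date = trim($dateNode->textContent);

    // Items whose NEAREST preceding date sibling ([1] on the reverse axis)
    // carries this date; items under later dates are excluded.
    $query = sprintf(
        '//*[%s and preceding-sibling::*[%s][1][normalize-space(.) = "%s"]]',
        $itemClass,
        $dateClass,
        $date
    );

    $result[$date] = array();
    foreach ($xpath->query($query) as $itemNode) {
        $result[$date][] = array('name' => trim($itemNode->textContent));
    }
}

print_r($result);
```

Because both queries start with //, the nesting depth of the date and item elements does not matter, as long as each date and its items remain siblings of one another.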

Licensed under: CC-BY-SA with attribution