Use QueryPath to get the contents of arbitrary HTML elements

https://stackoverflow.com/questions/5414269

29-10-2019

Question

I am using the PHP library QueryPath to extract data from a collection of old HTML files, and for the most part I have used the CSS selectors available through the find() function to extract the data. However, not all of the elements containing data I need to extract have a unique CSS identifier, so I have been using an ugly combination of regexps and QueryPath to extract the data.

<ul class="list"><li>Data1</li><li>Data2</li></ul>

How can I, for example, cleanly extract "Data2" from this element? Is there a QueryPath function that will let me specify, say, the second child of a parent element as the element to retrieve?

Solution

To get the nth matched object, call get(n-1) on the QueryPath object. The index is zero-based, so get(1) returns the second match.

Other suggestions

There are actually several ways to do this. The easiest is to use the CSS 3 pseudo-class :nth-of-type(). This gets the second LI directly inside of the UL:

qp($html, 'ul>li:nth-of-type(2)');

:nth-of-type and other CSS 3 selectors take what are called "an+b" rules, where you can say how many items make up a group, and then say which item from the group you want. For example, tr:nth-of-type(4n+2) will break up table rows into groups of 4, and then return the second element in each group. :even and :odd are just shorthand for 2n and 2n+1.
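To make the "an+b" arithmetic concrete, here is a minimal sketch in plain PHP that lists which 1-based positions a rule selects. It does not use QueryPath at all, and the helper name nthPositions() is hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper: return the 1-based positions among $count siblings
// that an "an+b" rule selects, where n takes the values 0, 1, 2, ...
function nthPositions(int $a, int $b, int $count): array
{
    $positions = [];
    for ($pos = 1; $pos <= $count; $pos++) {
        if ($a === 0) {
            // With a = 0 the rule names exactly one position: b.
            $matches = ($pos === $b);
        } else {
            // $pos matches when $pos = a*n + b for some integer n >= 0.
            $diff = $pos - $b;
            $matches = ($diff % $a === 0) && (intdiv($diff, $a) >= 0);
        }
        if ($matches) {
            $positions[] = $pos;
        }
    }
    return $positions;
}

// 4n+2 over 12 rows: the second row of each group of four.
echo implode(', ', nthPositions(4, 2, 12)), "\n"; // 2, 6, 10
```

Calling nthPositions(2, 0, 6) and nthPositions(2, 1, 6) gives the even and odd positions respectively, which is the sense in which :even and :odd are shorthand for 2n and 2n+1.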

Other CSS that might be worth looking into:

':nth'
':first-of-type', ':first'
':last-of-type', ':last'
':even', ':odd'
':not()', ':has()', and ':contains()'

You can also get all of the LI elements, and then keep just the second one. Note that eq() is zero-based, so the second element is at index 1:

qp($html, 'li')->eq(1);

Or, as a previous poster pointed out, you can get the actual DOMNode object for the second one using get(), which is also zero-based:

qp($html, 'li')->get(1);
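The difference between the two calls can be sketched as follows. This assumes QueryPath was installed with Composer so that vendor/autoload.php exists and provides the qp() helper; the path and the variable names are assumptions for illustration, and the markup is the list from the question.

```php
<?php
// Assumption: QueryPath is installed via Composer and the autoloader
// makes the qp() helper function available.
require 'vendor/autoload.php';

$html = '<ul class="list"><li>Data1</li><li>Data2</li></ul>';

// eq() takes a zero-based index and returns a QueryPath object
// wrapping a single match, so further QueryPath methods can be chained.
$second = qp($html, 'li')->eq(1);
echo $second->text(), "\n";

// get() also takes a zero-based index, but returns the underlying
// DOMNode, so the text is read through the standard DOM property.
$node = qp($html, 'li')->get(1);
echo $node->textContent, "\n";
```

With the sample markup, both lines should print Data2. Use eq() when you want to keep chaining QueryPath calls, and get() when you need the raw DOM object.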

If you have really sophisticated needs, you can use filter() to take a list, and run it through a custom function.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow