Question

My understanding is that once a document has been loaded with Jsoup.parse(), it never needs to be parsed again, because a neatly hierarchical tree is ready for the programmer's use.

What I am not sure about is whether a top-level select() is more costly than an inner-level select().

For example, if we have a <p> buried inside many nested <div>s, and that <p>'s parent is already available in the program, will there be any performance difference between:

document.select("p.pclass")

and

pImediateParent.select("p.pclass")

?

How does that work in Jsoup?

UPDATE: Based on the answer below, I understand that document.select() and pImediateParent.select() both call the exact same static method, just with a different root as the second parameter:

public Elements select(String query) {
    return Selector.select(query, this);
}

Which translates into:

/**
 * Find elements matching selector.
 *
 * @param query CSS selector
 * @param root  root element to descend into
 * @return matching elements, empty if not
 */
public static Elements select(String query, Element root) {
    return new Selector(query, root).select();
}

I am not surprised, but the question now is: how does that query work? Does it iterate over the tree to find the queried element, or is it a random-access lookup (as in a hash table)?


Solution

Yes, it will be faster if you use the immediate parent. If you check the Jsoup source code, you'll see that Element#select() delegates to the static Selector#select() method, passing the Element itself as the second argument. The javadoc of that method says:

select

public static Elements select(String query, Element root)

Find elements matching selector.

Parameters:

  • query - CSS selector
  • root - root element to descend into

Returns:

matching elements, empty if not

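Note the description of the root parameter: the selector descends only into the subtree under the root it is given, so a smaller subtree means fewer elements to examine. So yes, it does make a difference. Not a dramatic one, but a real one.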
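To make the effect of the root concrete, here is a minimal, self-contained sketch. It is not Jsoup's actual implementation: the Node and Result classes and the select() helper are hypothetical stand-ins, and the sketch assumes (as the "root element to descend into" wording suggests) that a query walks the subtree under its root and tests each element in turn rather than doing a hash-style lookup. It only counts how many nodes such a walk would touch from each starting point.

```java
import java.util.ArrayList;
import java.util.List;

public class ScopedSelectSketch {

    /** Hypothetical minimal element: a tag, an optional class name, and children. */
    static final class Node {
        final String tag;
        final String cssClass; // may be null
        final List<Node> children = new ArrayList<>();

        Node(String tag, String cssClass) {
            this.tag = tag;
            this.cssClass = cssClass;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Matches found by a scoped search, plus the number of nodes examined. */
    static final class Result {
        final List<Node> matches = new ArrayList<>();
        int visited = 0;
    }

    /** Walks the subtree under root depth-first, testing each node against tag + class. */
    static Result select(Node root, String tag, String cssClass) {
        Result result = new Result();
        collect(root, tag, cssClass, result);
        return result;
    }

    private static void collect(Node node, String tag, String cssClass, Result result) {
        result.visited++;
        if (node.tag.equals(tag) && cssClass.equals(node.cssClass)) {
            result.matches.add(node);
        }
        for (Node child : node.children) {
            collect(child, tag, cssClass, result);
        }
    }

    public static void main(String[] args) {
        // <html><body><div><span/></div><div><div><p class="pclass"/></div></div></body></html>
        Node target = new Node("p", "pclass");
        Node parent = new Node("div", null).add(target);
        Node document = new Node("html", null)
                .add(new Node("body", null)
                        .add(new Node("div", null).add(new Node("span", null)))
                        .add(new Node("div", null).add(parent)));

        Result fromDocument = select(document, "p", "pclass");
        Result fromParent = select(parent, "p", "pclass");

        System.out.println("from document: " + fromDocument.matches.size()
                + " match(es), " + fromDocument.visited + " nodes visited");
        System.out.println("from parent:   " + fromParent.matches.size()
                + " match(es), " + fromParent.visited + " nodes visited");
    }
}
```

Both searches find the same <p>, but the one rooted at the parent examines far fewer nodes. On a real page the gap grows with the size of the document outside the parent's subtree, which is why the saving is usually modest unless the document is large or the query runs many times.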

License: CC-BY-SA with attribution
Not affiliated with StackOverflow