Jsoup: Performance of top-level select() vs. inner-level select()

https://stackoverflow.com/questions/7829555

27-10-2019

Question

My understanding is that once a document is loaded into Jsoup, using Jsoup.parse(), no parsing is required again as a neatly hierarchical tree is ready for programmer's use.

But I am not sure whether a top-level select() is more costly than an inner-level select().

For example, if we have a <p> buried inside many nested <div>s, and that <p>'s parent is already available in the program, will there be any performance difference between:

document.select("p.pclass")

and

pImediateParent.select("p.pclass")

How does that work in Jsoup?

UPDATE: Based on the answer below, I understand that both document.select() and pImediateParent.select() use the same exact static method, just with a different root as the second parameter:

public Elements select(String query) {
    return Selector.select(query, this);
}

Which translates into:

/**
 * Find elements matching selector.
 *
 * @param query CSS selector
 * @param root  root element to descend into
 * @return matching elements, empty if not
 */
public static Elements select(String query, Element root) {
    return new Selector(query, root).select();
}

I am not surprised, but the question now is: how does that query work? Does it iterate over the tree to find the queried element, or is it a random-access lookup (as in a hash table)?

Answer

Yes, it will be faster if you use the immediate parent. If you check the Jsoup source code, you'll see that Element#select() delegates to the Selector#select() method with the Element itself as the second argument. The javadoc of that method says:

select
public static Elements select(String query, Element root)
Find elements matching selector.

Parameters:
    query - CSS selector
    root - root element to descend into

Returns:
    matching elements, empty if not

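Note the description of the root parameter: the search descends from the given root, so a smaller subtree presumably means fewer elements to examine. So yes, it does make a difference. It is unlikely to be dramatic, but it is there.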
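
To make the "root element to descend into" point concrete, here is a toy model in plain Java. It is not Jsoup's code. It only assumes that a select call visits the root and every descendant and tests each one against the selector. Node, select, visited and buildSample are hypothetical names made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Jsoup's implementation. All names here are hypothetical.
class SubtreeSearchDemo {

    static final class Node {
        final String tag;
        final String cssClass;                     // null when the node has no class
        final List<Node> children = new ArrayList<>();

        Node(String tag, String cssClass) {
            this.tag = tag;
            this.cssClass = cssClass;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Number of nodes examined by the most recent select() call.
    static int visited;

    // Collects every node at or below root whose tag and class both match.
    static List<Node> select(Node root, String tag, String cssClass) {
        visited = 0;
        List<Node> matches = new ArrayList<>();
        walk(root, tag, cssClass, matches);
        return matches;
    }

    // Depth-first walk: every node in the subtree is examined exactly once.
    private static void walk(Node node, String tag, String cssClass, List<Node> out) {
        visited++;
        if (node.tag.equals(tag) && cssClass.equals(node.cssClass)) {
            out.add(node);
        }
        for (Node child : node.children) {
            walk(child, tag, cssClass, out);
        }
    }

    // Builds html > body > [div.sidebar > span x3] and [div > div > div.parent > p.pclass].
    // Returns { document root, immediate parent of the <p> }.
    static Node[] buildSample() {
        Node p = new Node("p", "pclass");
        Node parent = new Node("div", "parent").add(p);
        Node nested = new Node("div", null).add(new Node("div", null).add(parent));
        Node sidebar = new Node("div", "sidebar")
                .add(new Node("span", null))
                .add(new Node("span", null))
                .add(new Node("span", null));
        Node document = new Node("html", null)
                .add(new Node("body", null).add(sidebar).add(nested));
        return new Node[] { document, parent };
    }

    public static void main(String[] args) {
        Node[] sample = buildSample();

        int fromDocument = select(sample[0], "p", "pclass").size();
        System.out.println("document: " + fromDocument + " match, " + visited + " nodes visited");

        int fromParent = select(sample[1], "p", "pclass").size();
        System.out.println("parent:   " + fromParent + " match, " + visited + " nodes visited");
    }
}
```

Both calls find the same single <p>, but the walk from the document root examines all 10 nodes of the sample tree while the walk from the immediate parent examines only 2. Under this assumption the cost grows with the size of the subtree below the root rather than being a constant-time lookup. Whether a given Jsoup version behaves exactly like this is best confirmed against its source.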

License: CC-BY-SA with attribution

Not affiliated with StackOverflow