Jsoup: Performance of top-level select() vs. inner-level select()
27-10-2019
Problem
My understanding is that once a document is loaded into Jsoup using Jsoup.parse(), no parsing is required again, as a neatly hierarchical tree is ready for the programmer's use.
What I am not sure about is whether a top-level select() is more costly than an inner-level select().
For example, if we have a <p> buried inside many nested <div>s, and that <p>'s parent is already available in the program, will there be any performance difference between:
document.select("p.pclass")
and
pImediateParent.select("p.pclass")
?
How does that work in Jsoup?
UPDATE: Based on the answer below, I understand that both document.select() and pImediateParent.select() use exactly the same static method, just with a different root as the second parameter:
public Elements select(String query) {
    return Selector.select(query, this);
}
Which translates into:
/**
 * Find elements matching selector.
 *
 * @param query CSS selector
 * @param root root element to descend into
 * @return matching elements, empty if not
 */
public static Elements select(String query, Element root) {
    return new Selector(query, root).select();
}
I am not surprised, but the question now is: how does that query work? Does it iterate to find the queried element, or is it a random-access lookup (as in a hash table)?
Solution
Yes, it will be faster if you use the intermediate parent. If you check the Jsoup source code, you'll see that Element#select() actually delegates to the Selector#select() method with the Element itself as the second argument. Now, the javadoc of that method says:
select
public static Elements select(String query, Element root)
Find elements matching selector.
Parameters:
- query - CSS selector
- root - root element to descend into
Returns:
matching elements, empty if not
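Note the description of the root parameter: it is the element to descend into, so the search covers only that element's subtree rather than the whole document. So yes, it definitely makes a difference. Not a shocking one, but there is some difference.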
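To make the difference concrete, here is a minimal sketch in plain Java. It is a simplified model and not Jsoup's actual implementation: Node, select(), buildDocument() and the visited counter are hypothetical names made up for illustration. It assumes the query is answered by walking every node under the given root (which is what "root element to descend into" suggests) rather than by a hash-table lookup, so the work grows with the size of the subtree being searched.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a subtree search; these are NOT Jsoup classes.
class SubtreeSelectSketch {

    /** Minimal stand-in for an element: a tag name, a class name, and children. */
    static final class Node {
        final String tag;
        final String className;
        final List<Node> children = new ArrayList<>();

        Node(String tag, String className) {
            this.tag = tag;
            this.className = className;
        }

        /** Adds a child and returns it, so nested levels can be built in a loop. */
        Node append(Node child) {
            children.add(child);
            return child;
        }
    }

    /** Number of nodes visited by the most recent select() call. */
    static int visited;

    /** Depth-first walk from root (inclusive), collecting nodes matching tag.className. */
    static List<Node> select(Node root, String tag, String className) {
        visited = 0;
        List<Node> matches = new ArrayList<>();
        walk(root, tag, className, matches);
        return matches;
    }

    private static void walk(Node node, String tag, String className, List<Node> out) {
        visited++; // every node under the root is inspected once
        if (node.tag.equals(tag) && node.className.equals(className)) {
            out.add(node);
        }
        for (Node child : node.children) {
            walk(child, tag, className, out);
        }
    }

    /**
     * Builds a root with some unrelated <div> children plus a chain of nested
     * <div>s ending in a single p.pclass. Returns { document, immediate parent of the p }.
     */
    static Node[] buildDocument(int depth, int unrelatedSiblings) {
        Node document = new Node("html", "");
        for (int i = 0; i < unrelatedSiblings; i++) {
            document.append(new Node("div", "noise"));
        }
        Node current = document;
        for (int i = 0; i < depth; i++) {
            current = current.append(new Node("div", ""));
        }
        current.append(new Node("p", "pclass"));
        return new Node[] { document, current };
    }

    public static void main(String[] args) {
        Node[] built = buildDocument(5, 100);

        List<Node> fromDocument = select(built[0], "p", "pclass");
        System.out.println("document: matches=" + fromDocument.size() + ", visited=" + visited);

        List<Node> fromParent = select(built[1], "p", "pclass");
        System.out.println("parent:   matches=" + fromParent.size() + ", visited=" + visited);
    }
}
```

Both calls find the same node, but the search from the immediate parent inspects only that parent and its descendants, while the search from the document inspects every node in the tree. In this model, the closer the root is to the target, the less work there is, which matches the "some difference, but not a shocking one" conclusion for typical page sizes.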