I am currently scraping pollen data from wunderground because their API doesn't offer it. Specifically, I want the pollen values attributed to each day.

I've navigated the HTML using Chrome Dev Tools and found the specific element that I want. Following the JSoup documentation, I tried writing my own CSS selectors, but I am quite lost.

Could anyone give me some insight into how to access that particular element?

Below is what I have so far.

doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();
Element title = doc.getElementById("td");
Element tagName = doc.tagName("id");
System.out.println(tagName);


Solution

You don't want to use doc.getElementById("td"), because td is a tag name, not the value of an id attribute (also, getElementById doesn't accept CSS queries).

What you want is to select the first <td> with the class levels. You can do it via

Element tag = doc.select("td.levels").first();

Also, to get only the text contained in this tag (and not its entire HTML), use the text() method, like

System.out.println(tag.text());
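
To try the selector without hitting the live site, here is a minimal sketch that parses an inline HTML string instead. The markup in SAMPLE_HTML and the names PollenSelectorSketch and firstLevel are made up for illustration; only the td.levels selector comes from the answer above. It also guards against first() returning null, which happens when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PollenSelectorSketch {
    // Hypothetical markup: only the "levels" class name comes from the thread.
    static final String SAMPLE_HTML =
        "<table><tr>"
        + "<td class=\"levels\">8.40</td>"
        + "<td class=\"levels\">7.90</td>"
        + "</tr></table>";

    // Returns the text of the first td.levels cell, or null if none matches.
    static String firstLevel(String html) {
        Document doc = Jsoup.parse(html);
        // select() takes a CSS query; first() yields null on an empty result.
        Element cell = doc.select("td.levels").first();
        return cell == null ? null : cell.text();
    }

    public static void main(String[] args) {
        System.out.println(firstLevel(SAMPLE_HTML));       // prints "8.40"
        System.out.println(firstLevel("<p>no table</p>")); // prints "null"
    }
}
```

If the real page uses different class names than the ones seen in Dev Tools at the time, the null check keeps the program from throwing a NullPointerException on text().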

Other tips

// Fetch and parse the pollen page.
Document doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();

// Day labels: the td.even-four cells inside the first table.pollen-table.
Elements days = doc.select("table.pollen-table").first().select("td.even-four");
for (Element day : days) {
    System.out.println(day.text());
}

// Pollen levels: every td.levels cell in the document.
Elements levels = doc.select("td.levels");
for (Element level : levels) {
    System.out.println(level.text());
}
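
Since the goal is the value attributed to each day, one possible way to combine the two lists is to pair them by position. This is only a sketch: it assumes the n-th td.even-four cell corresponds to the n-th td.levels cell, which the thread does not confirm. The markup in SAMPLE_HTML and the names PollenPairingSketch and levelsByDay are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class PollenPairingSketch {
    // Hypothetical markup: only the class names come from the thread.
    static final String SAMPLE_HTML =
        "<table class=\"pollen-table\">"
        + "<tr><td class=\"even-four\">Monday</td><td class=\"even-four\">Tuesday</td></tr>"
        + "<tr><td class=\"levels\">8.40</td><td class=\"levels\">7.90</td></tr>"
        + "</table>";

    // Pairs day labels with levels by index (assumed correspondence).
    static Map<String, String> levelsByDay(String html) {
        Document doc = Jsoup.parse(html);
        // Descendant selectors: matches cells in any table.pollen-table,
        // not just the first one as in the snippet above.
        Elements days = doc.select("table.pollen-table td.even-four");
        Elements levels = doc.select("table.pollen-table td.levels");

        Map<String, String> result = new LinkedHashMap<>(); // keeps page order
        int n = Math.min(days.size(), levels.size());       // ignore unpaired cells
        for (int i = 0; i < n; i++) {
            result.put(days.get(i).text(), levels.get(i).text());
        }
        return result;
    }

    public static void main(String[] args) {
        levelsByDay(SAMPLE_HTML)
            .forEach((day, level) -> System.out.println(day + ": " + level));
    }
}
```

Against the real page, pass Jsoup.connect(...).get().html() or adapt the method to take a Document, and check in Dev Tools that the two cell lists really line up.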
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow