Question

I am scraping pollen data from Wunderground because their API does not expose it. Specifically, I need the pollen value reported for each day.

I inspected the HTML with Chrome DevTools and found the specific element I want. Following the jsoup documentation, I tried writing my own CSS selectors, but I am quite lost.

Could anyone give me some insight into how to access that particular element?

Here is what I have so far:

doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();
Element title = doc.getElementById("td");
Element tagName = doc.tagName("id");
System.out.println(tagName);


Solution

You don't want to use doc.getElementById("td"), because <td> is a tag, not the value of an id attribute (also, getElementById doesn't accept CSS queries).

What you want is to select the first <td> with the class levels. You can do that with

Element tag = doc.select("td.levels").first();

Also, to get only the text contained in this tag (and not its entire HTML), use the text() method:

System.out.println(tag.text());
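
Note that first() returns null when the selector matches nothing, so calling text() on the result can throw a NullPointerException if the page layout changes. The following is a minimal offline sketch of the same lookup with a null check. It parses a small hand-written HTML fragment instead of fetching the live page; the fragment, its values, the class name LevelsDemo, and the helper firstLevel are all assumptions made for illustration, and the real page's markup may differ.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class LevelsDemo {
    // Hypothetical stand-in for the page's markup, with made-up values.
    static final String SAMPLE_HTML =
        "<table class=\"pollen-table\"><tr>"
        + "<td class=\"levels\"><p>2.40</p></td>"
        + "<td class=\"levels\"><p>4.50</p></td>"
        + "</tr></table>";

    // Returns the text of the first <td class="levels">, or null if none matches.
    static String firstLevel(String html) {
        Document doc = Jsoup.parse(html);
        Element tag = doc.select("td.levels").first(); // null when nothing matches
        return tag == null ? null : tag.text();
    }

    public static void main(String[] args) {
        System.out.println(firstLevel(SAMPLE_HTML));
        System.out.println(firstLevel("<p>no table here</p>"));
    }
}
```

With the live page, the Jsoup.parse(html) call would be replaced by the Jsoup.connect(...).get() call shown in the question.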

Other tips

Document doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();

Elements days = doc.select("table.pollen-table").first().select("td.even-four");
for (Element day : days) {
    System.out.println(day.text());
}


Elements levels = doc.select("td.levels");
for (Element level : levels) {
    System.out.println(level.text());
}
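
The two loops above print the days and the levels separately. If each day cell corresponds to the level cell at the same position (an assumption about the page layout that the answers do not confirm), the two lists can be paired by index. The sketch below does this on a hand-written HTML fragment with made-up values; the class name PollenPairs and the helper pair are hypothetical, and only the selectors come from the answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class PollenPairs {
    // Hypothetical sample markup with made-up values; the live page may differ.
    static final String SAMPLE_HTML =
        "<table class=\"pollen-table\">"
        + "<tr><td class=\"even-four\">Monday</td><td class=\"even-four\">Tuesday</td></tr>"
        + "<tr><td class=\"levels\">2.40</td><td class=\"levels\">4.50</td></tr>"
        + "</table>";

    // Maps each day label to the level at the same index, in document order.
    static Map<String, String> pair(String html) {
        Document doc = Jsoup.parse(html);
        // Descendant selectors return an empty list (not null) when nothing matches.
        Elements days = doc.select("table.pollen-table td.even-four");
        Elements levels = doc.select("table.pollen-table td.levels");
        Map<String, String> result = new LinkedHashMap<>();
        int n = Math.min(days.size(), levels.size()); // ignore unmatched extras
        for (int i = 0; i < n; i++) {
            result.put(days.get(i).text(), levels.get(i).text());
        }
        return result;
    }

    public static void main(String[] args) {
        pair(SAMPLE_HTML).forEach((day, level) -> System.out.println(day + ": " + level));
    }
}
```

Unlike select("table.pollen-table").first().select(...), the descendant selector here searches every table with that class and does not fail when the table is missing.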
License: CC-BY-SA with attribution
Not affiliated with StackOverflow