문제

I am currently scraping pollen data from wunderground since their API accessor doesn't offer pollen data, specifically the values attributed to each day.

I've navigated the HTML using Chrome Dev Tools and found the specific line that I want. Using the documentation offered by JSoup, I tried putting in my own custom CSS Selectors, but I am quite lost.

I was wondering if anyone would give me some insight on how to access that particular element.

For example, below is an example of what I have so far.

doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();
Element title = doc.getElementById("td");
Element tagName = doc.tagName("id");
System.out.println(tagName);

enter image description here

도움이 되었습니까?

해결책

You don't want to use doc.getElementById("td") because <td> is not id attribute, but tag (also getElementById doesn't support CSS query).

What you want is to select first <td> with class levels. You can do it via

Element tag = doc.select("td.levels").first();

Also to get only text which will be generated with this tag (and not entire HTML) use text() method like

System.out.println(tag.text());

다른 팁

Document doc = Jsoup.connect("http://www.wunderground.com/DisplayPollen.asp?Zipcode=19104").get();

Elements days = doc.select("table.pollen-table").first().select("td.even-four");
for (Element day : days) {
    System.out.println(day.text());
}


Elements levels = doc.select("td.levels");
for (Element level : levels) {
    System.out.println(level.text());
}
라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top