You can extract the data using the CSS class names, or using the tag names alone:
Elements headings = doc.select("td[class=whiteheading]");
Elements data = doc.select("td[class=copyrights]");
for (Element el : headings) {
    System.out.print(el.text() + "\t\t\t");
}
System.out.println();
for (Element el : data) {
    System.out.print(el.text() + "\t");
}
This prints:
Sl. No City Hospital / Nursing Home Address State
BAGALKOT KERUDI HOSPITAL & RESEARCH CENTRE EXTENSION, HOSPITAL ROAD,BAGALKOT, KARNATAKA-587101. KARNATAKA
The above code gets the text of every td element that carries the heading or data class and prints it to your console. The only problem is the serial number: its cell has no CSS class associated with it, so neither selector matches it and it is missing from the output above. Hence, the other option is to select on the tag name alone and filter the results afterwards:
Elements data = doc.select("td");
for (Element el : data) {
    System.out.print(el.text() + "\t");
}
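
One possible way to do the filtering step is to group the flat list of cell texts back into rows. The sketch below is plain Java and makes several assumptions that are not confirmed by your HTML: that every td on the page belongs to this table, that the table has five columns (as the headings printed above suggest), and that the first row holds the headings. The class CellGrouper and the method toRows are hypothetical names, and the serial number "1" is a placeholder value. With jsoup, the input list could come from doc.select("td").eachText().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CellGrouper {

    // Splits a flat list of cell texts into rows of `columns` cells each.
    // A trailing partial row is kept so that no cell is silently dropped.
    static List<List<String>> toRows(List<String> cells, int columns) {
        if (columns <= 0) {
            throw new IllegalArgumentException("columns must be positive");
        }
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < cells.size(); i += columns) {
            int end = Math.min(i + columns, cells.size());
            rows.add(new ArrayList<>(cells.subList(i, end)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Stand-in for doc.select("td").eachText().
        // "1" is a placeholder serial number, not taken from the real page.
        List<String> cells = Arrays.asList(
            "Sl. No", "City", "Hospital / Nursing Home", "Address", "State",
            "1", "BAGALKOT", "KERUDI HOSPITAL & RESEARCH CENTRE",
            "EXTENSION, HOSPITAL ROAD,BAGALKOT, KARNATAKA-587101.", "KARNATAKA");

        // Assumption: five columns, first row is the heading row.
        List<List<String>> rows = toRows(cells, 5);

        System.out.println(String.join("\t", rows.get(0)));
        for (List<String> row : rows.subList(1, rows.size())) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

If the page contains other td elements (layout tables, footers), narrow the selector to the table in question first, otherwise the rows will be misaligned.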