Question

I'm trying to pull the data from the tables on the page http://www.pgatour.com/players/player.24502.adam-scott.html/season

But the XPath I copied from Chrome returns null. I've tried several variations and none of them work. I've never used XPath before. Am I missing something?

string Url = "http://www.pgatour.com/players/player.24502.adam-scott.html/season";
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(Url);

var firstTournamentDate = doc.DocumentNode.SelectNodes("//*[@id='player-season-details-table']/div/table/tr[2]/td[1]");

Note: I've removed the tbody step from the XPath that Chrome produced.

Edit:

The firstTournamentDate variable is null. If I then try this:

var x = doc.DocumentNode.SelectNodes("//*[@id='player-season-details-table']/div/table/tr[2]/td[1]")[0].InnerText;

it throws a NullReferenceException.


Solution

The data is loaded dynamically using AJAX, so it is not part of the HTML that HtmlWeb.Load downloads. An XPath expression cannot reach it unless the page's JavaScript is executed first, for example with a browser-automation tool such as Selenium.
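As a side note on the exception in the question: HtmlAgilityPack's SelectNodes and SelectSingleNode return null rather than an empty result when nothing matches, so indexing the result directly fails. The sketch below guards against that. The HTML string is a made-up stand-in for a page whose table has not been filled in by JavaScript (an assumption, not the real page source), and the NullSafeXPath class and FirstCellText helper are hypothetical names.

```csharp
using System;
using HtmlAgilityPack;

public static class NullSafeXPath
{
    // Returns the trimmed text of the first node matching the XPath,
    // or null when nothing matches (SelectSingleNode returns null then).
    public static string FirstCellText(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        // Assumed stand-in for the served HTML: the container is there,
        // but the table inside it would only be added later by JavaScript.
        string served = "<html><body><div id='player-season-details-table'></div></body></html>";
        string xpath = "//*[@id='player-season-details-table']/div/table/tr[2]/td[1]";

        string text = FirstCellText(served, xpath);
        Console.WriteLine(text ?? "No match: the table is not in the downloaded HTML.");
    }
}
```

The guard only avoids the exception. It does not make the missing data appear.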

Alternatively, open Firebug or Chrome DevTools and watch the "Network" tab while the page loads to find out which URL the page requests for its data. I think you're looking for

http://www.pgatour.com/data/players/24502/2014results.json

which returns the table content as easy-to-parse JSON.
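A minimal sketch of fetching that JSON from C#, using HttpClient and System.Text.Json instead of HtmlAgilityPack. The shape of the JSON is not described here, so the code only lists the top-level property names as a first step in exploring it. The PlayerResults class, the BuildResultsUrl and TopLevelKeys helpers, and the URL pattern (generalized from the single URL above) are assumptions for illustration, and the endpoint may no longer respond.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

public static class PlayerResults
{
    // Assumed pattern, inferred from the one example URL in the answer.
    public static string BuildResultsUrl(int playerId, int season) =>
        $"http://www.pgatour.com/data/players/{playerId}/{season}results.json";

    // Lists the top-level property names of a JSON object so its
    // structure can be inspected before writing a typed model.
    public static string[] TopLevelKeys(string json)
    {
        using JsonDocument doc = JsonDocument.Parse(json);
        var keys = new List<string>();
        if (doc.RootElement.ValueKind == JsonValueKind.Object)
        {
            foreach (JsonProperty property in doc.RootElement.EnumerateObject())
            {
                keys.Add(property.Name);
            }
        }
        return keys.ToArray();
    }

    public static async Task Main()
    {
        using var client = new HttpClient();
        // Throws HttpRequestException if the request fails.
        string json = await client.GetStringAsync(BuildResultsUrl(24502, 2014));

        foreach (string key in TopLevelKeys(json))
        {
            Console.WriteLine(key);
        }
    }
}
```

Once the property names are known, the same JsonDocument can be walked with GetProperty and EnumerateArray to reach the tournament rows.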

Licensed under: CC-BY-SA with attribution