The logic of the original code is a bit off: if test evaluates to true, the loop never terminates. It seems that you want to perform the check in every loop iteration rather than only once at the beginning.
Anyway, there is a better approach. You can select all relevant nodes without specifying the index of each <tr>, and use foreach to loop through the node set:
var nodes = doc.DocumentNode.SelectNodes("//*[@id='t1']/tr/td[3]/a[2]");
foreach (HtmlNode node in nodes)
{
    string name = node.InnerText;
    //extract data
}
Or use a for loop instead of foreach, if the index of each node is needed in the "extract data" step:
for (int i = 1; i <= nodes.Count; i++)
{
    //array index starts from 0, unlike XPath element index which starts from 1
    string name = nodes[i - 1].InnerText;
    //extract data
}
Side note: to query a single element, you can use SelectSingleNode("...") instead of SelectNodes("...")[0]. Both methods return null if no node matches the XPath expression, so you can check the returned value itself, rather than its InnerText property, to avoid an exception:
var node = doc.DocumentNode.SelectSingleNode("...");
if (node != null)
{
    //do something
}
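Since SelectNodes also returns null when nothing matches, the same kind of check applies before looping over a node set. Here is a minimal sketch of how the pieces could fit together. The class name LinkTextExtractor and the method name ExtractNames are hypothetical, made up for illustration, and the sketch assumes the HtmlAgilityPack package is referenced:

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class LinkTextExtractor
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the trimmed
    // text of every node matching the XPath, or an empty list if none match.
    public static List<string> ExtractNames(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var names = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches,
        // so check the returned value before iterating.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes == null)
        {
            return names;
        }

        foreach (HtmlNode node in nodes)
        {
            names.Add(node.InnerText.Trim());
        }
        return names;
    }
}
```

It could be called as LinkTextExtractor.ExtractNames(html, "//*[@id='t1']/tr/td[3]/a[2]"), where html holds the page markup as a string. Returning an empty list rather than null is just one possible design choice; it lets the caller loop over the result without another null check.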