Question

I'm building an app that crawls OkCupid matches. The match results page contains HTML that looks like this:

<div id="match_results">
    <div>person1</div>
    <div>person2</div>
    <div>person3</div>
</div>

I want to loop with foreach over each person's div inside the match_results div. However, something is not quite right with my C# code: matchesList contains only one element (apparently the match_results div itself, not the divs inside it).

HtmlDocument matchesHtmlDoc = new HtmlDocument();
matchesHtmlDoc.LoadHtml(matches);

string matchResultDivId = "match_results";

// match results
HtmlNodeCollection matchesList = matchesHtmlDoc.DocumentNode.SelectNodes("//div[@id = '" + matchResultDivId + "']");

foreach (HtmlNode match in matchesList)
{
    //test
    Console.WriteLine(match.ToString());
}

Solution

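Your XPath matches only the match_results container itself, which is why you get a single node. Append /div to select its child divs: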

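// Requires: using System; using System.Linq; using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(matches);

string matchResultDivId = "match_results";
// The trailing /div selects the container's direct child divs, not the container itself.
string xpath = String.Format("//div[@id='{0}']/div", matchResultDivId);
// Project each matched node to its text content.
var people = doc.DocumentNode.SelectNodes(xpath).Select(p => p.InnerText);

foreach (var person in people)
    Console.WriteLine(person);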

Output:

person1
person2
person3
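
One caveat worth sketching: in Html Agility Pack, SelectNodes returns null rather than an empty collection when the XPath matches nothing, so calling .Select on the result throws if the page layout changes or the id is missing. The version below is a minimal sketch of a null-safe variant, not a definitive implementation. MatchParser and GetPeople are hypothetical names introduced for illustration, and the sample markup in Main stands in for the real page.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

static class MatchParser
{
    // Hypothetical helper: returns the text of each direct child div
    // of the div whose id equals containerId.
    public static List<string> GetPeople(string html, string containerId)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        string xpath = String.Format("//div[@id='{0}']/div", containerId);
        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes(xpath);

        // SelectNodes yields null when nothing matches, so guard before using LINQ.
        if (nodes == null)
            return new List<string>();

        return nodes.Select(n => n.InnerText.Trim()).ToList();
    }
}

class Program
{
    static void Main()
    {
        // Sample markup standing in for the real match results page (assumption).
        string matches =
            "<div id=\"match_results\">" +
            "<div>person1</div><div>person2</div><div>person3</div>" +
            "</div>";

        foreach (string person in MatchParser.GetPeople(matches, "match_results"))
            Console.WriteLine(person);
    }
}
```

Returning an empty list instead of null lets the calling foreach run zero times without a special case; throwing a descriptive exception would be an equally reasonable choice if a missing container should be treated as an error.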
License: CC-BY-SA with attribution
Not affiliated with StackOverflow