Question

Given this HTML source:

<strong>Apple</strong> <span id="apple">Red</span>
<strong>Orange</strong> <span id="orange">Orange</span>
<strong>Beans</strong> <span id="beans">Green</span>
<strong>Carrot</strong> <span id="carrot">Orange</span>
<strong>Banana</strong> <span id="banana">Yellow</span>
<strong>Grapes</strong> <span id="grape">Green</span>

I am using HTML Agility Pack to retrieve the beans and carrot colors (Green and Orange), which are the inner text of the spans with the IDs beans and carrot respectively. This is the code I use:

string beans = document.DocumentNode.Descendants("span")
        .Where(node => node.Attributes["id"] != null && node.Attributes["id"].Value == "beans")
        .ToArray().ElementAt(0).InnerText.Trim();

string carrot = document.DocumentNode.Descendants("span")
        .Where(node => node.Attributes["id"] != null && node.Attributes["id"].Value == "carrot")
        .ToArray().ElementAt(0).InnerText.Trim();

But this is slow, because the code searches the span nodes twice. Is there a more efficient way to access a particular span element?

Without this code, the source loads into the document quickly via Agility Pack. Adding this piece of code slows the process down.

Also, if the HTML source does not contain the specific ID, the code throws an exception.

I want to save both vegetable colors (Green and Orange) in two separate variables, as I will be writing them to a comma-delimited text file using a List.

Solution

I would try to insert the spans into a dictionary, assuming that the span IDs are unique:

Dictionary<string, HtmlNode> spans = document.DocumentNode.Descendants("span")
    .Where(node => node.Attributes["id"] != null)
    .ToDictionary(node => node.Attributes["id"].Value);

Now you can get the spans quickly with:

HtmlNode span;
if (spans.TryGetValue("apple", out span)) {
    string text = span.InnerText.Trim();
}

Or get the inner text directly:

Dictionary<string, string> texts = document.DocumentNode.Descendants("span")
    .Where(node => node.Attributes["id"] != null)
    .ToDictionary(node => node.Attributes["id"].Value,
                  node => node.InnerText.Trim());

Now you can get the texts quickly with:

string text;
if (texts.TryGetValue("apple", out text)) {
    Console.WriteLine(text);
}

Or, if you are sure that the span IDs exist (the indexer throws a KeyNotFoundException otherwise):

string apple = texts["apple"];
string orange = texts["orange"];
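
To tie this back to the question (two colors, a possibly missing ID, and a comma-delimited line), here is a minimal sketch that uses only the .NET base library. The names SpanTexts, BuildMap and ToCsvLine are hypothetical helpers, not part of HTML Agility Pack, and the id/text pairs are assumed to come from a single pass over the span nodes as shown above. Unlike ToDictionary, this variant keeps the first occurrence of a repeated ID instead of throwing, and it substitutes an empty string for an ID that is not present.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; nothing here belongs to HTML Agility Pack.
public static class SpanTexts
{
    // Builds an id -> trimmed text map in a single pass.
    // The first occurrence of an id wins, so a repeated id does not throw.
    public static Dictionary<string, string> BuildMap(
        IEnumerable<KeyValuePair<string, string>> idTextPairs)
    {
        var map = new Dictionary<string, string>();
        foreach (var pair in idTextPairs)
        {
            if (!map.ContainsKey(pair.Key))
            {
                map.Add(pair.Key, pair.Value.Trim());
            }
        }
        return map;
    }

    // Looks up each requested id (empty string when the id is absent)
    // and joins the results into one comma-delimited line.
    public static string ToCsvLine(Dictionary<string, string> map, params string[] ids)
    {
        var values = new List<string>();
        foreach (string id in ids)
        {
            string text;
            values.Add(map.TryGetValue(id, out text) ? text : string.Empty);
        }
        return string.Join(",", values);
    }
}
```

With a map built from the sample HTML in the question, SpanTexts.ToCsvLine(map, "beans", "carrot") yields Green,Orange. The empty-string fallback is only one possible choice; skipping the row or logging the missing ID may suit the output file better.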
License: CC-BY-SA with attribution
Not affiliated with StackOverflow