Use HtmlAgilityPack to parse HTML variable, not HTML document?

https://stackoverflow.com/questions/22636841

20-06-2023

Question

I have a variable in my program that contains HTML data as a string. The variable, htmlText, contains something like the following:

<ul><li><u>Mode selector </u></li><li><u>LAND ALT</u></li>

I'd like to iterate through this HTML, using the HtmlAgilityPack, but every example I see tries to load the HTML as a document. I already have the HTML that I want to parse within the variable htmlText. Can someone show me how to parse this, without loading it as a document?

The example I'm looking at right now looks like this:

static void Main(string[] args)
{
    var web = new HtmlWeb();
    var doc = web.Load("http://www.stackoverflow.com");

    var nodes = doc.DocumentNode.SelectNodes("//a[@href]");

    foreach (var node in nodes)
    {
        Console.WriteLine(node.InnerHtml);
    }
}

I want to convert this to use my htmlText and find all underline elements within. I just don't want to load this as a document since I already have the HTML that I want to parse stored in a variable.

Solution

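You can use the LoadHtml method of the HtmlDocument class. It parses an HTML string that is already in memory, so nothing has to be loaded from a file or URL.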
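As a minimal sketch of that approach, assuming the HtmlAgilityPack NuGet package is referenced: the class name UnderlineExtractor and the helper ExtractUnderlined are hypothetical names chosen for illustration, not part of the library. It uses Descendants("u") rather than an XPath query, mainly because Descendants yields an empty sequence when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package is installed

public static class UnderlineExtractor
{
    // Hypothetical helper: returns the trimmed text of every <u> element
    // found in an HTML string.
    public static List<string> ExtractUnderlined(string htmlText)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(htmlText); // parses the in-memory string; no file or network access

        // Descendants() yields an empty sequence when nothing matches,
        // so no null check is needed here.
        return doc.DocumentNode
                  .Descendants("u")
                  .Select(u => u.InnerText.Trim())
                  .ToList();
    }

    public static void Main()
    {
        // Same fragment as in the question (the <ul> is left unclosed on purpose).
        string htmlText = "<ul><li><u>Mode selector </u></li><li><u>LAND ALT</u></li>";

        foreach (string text in ExtractUnderlined(htmlText))
        {
            Console.WriteLine(text);
        }
    }
}
```

The parser is tolerant of fragments, so the string does not need to be a complete, well-formed page for this to work.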

Other answers

"Document" is simply a name; it's not really a document (or doesn't have to be). An HtmlDocument can be loaded from any HTML string, including a fragment.

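var doc = new HtmlAgilityPack.HtmlDocument();
string myHTML = "<ul><li><u>Mode selector </u></li><li><u>LAND ALT</u></li>";
doc.LoadHtml(myHTML); // parse the string in memory
// Select the <u> elements the question asks about.
// SelectNodes returns null when nothing matches, so guard before iterating.
var nodes = doc.DocumentNode.SelectNodes("//u");
if (nodes != null) {
    foreach (var node in nodes) {
        Console.WriteLine(node.InnerHtml);
    }
}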

I've used this same approach to parse HTML chunks held in variables.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow