Question

First of all, I am pretty much still a beginner, especially when it comes to web stuff.

From my WinForms application, I am trying to read the content of a text box on a web page that is open in a browser. I am not able to modify the source code of the web page itself. Sadly, the string I am looking for is not simply written in the page source, so I can't just download the source and parse it. It seems the content of the text box is populated via JavaScript.

Generally speaking, I am not sure where to even start here. Any suggestions are very welcome.

Also, I am not sure what other information I should put here. I don't have an idea where to start, so I don't have any code yet to show.

Edit:

I have been trying to use the Html Agility Pack, but I am still not sure how to get to what I need. Here is my code so far:

WebClient client = new WebClient();
String html = client.DownloadString(URL);
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//div[@class='ember-view']"))
{
    HtmlAttribute div = link.Attributes["div"];
    if (div != null)
    {
        outputBox.Text += div.Value;
    }
}

When I run the code, I get this:

An unhandled exception of type 'System.NullReferenceException' occurred. Additional information: Object reference not set to an instance of an object.

When I go to the web page and do Inspect Element I get this (I only copied a few lines):

<html class="no-js" lang="en">

<head></head>
<body class="ember-application" lang="en-US" data-environment="production">
    <div id="booting" style="display: none;"></div>
    <div id="ember2493" class="ember-view">
        <div id="alert" class="ember-view"></div>

I am not sure how to get to, let's say, the inner content of this line:

<div id="alert" class="ember-view"></div>

Also, my apologies if this is something obvious that I am missing, but again, this is all new for me. Thanks for the help so far.

No correct solution

OTHER TIPS

Do you know the Html Agility Pack? I always use it for HTML crawling.

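 HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm");
 // Select every <a> element that has an href attribute
 foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
 {
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att);  // FixLink is your own helper, not part of the library
 }
 doc.Save("file.htm");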
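
As for the NullReferenceException in the question: a likely cause is that SelectNodes returns null when nothing matches, and the foreach then fails. That would happen if the HTML fetched by WebClient does not contain the divs that the page's JavaScript creates later. Here is a minimal sketch of a null-safe lookup. The class name AlertReader and both method names are made up for illustration, and the id "alert" is taken from the Inspect Element output in the question.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class AlertReader
{
    // Returns the inner HTML of <div id="alert">, or null when the
    // markup does not contain that element.
    public static string ReadAlertInnerHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath
        HtmlNode alert = doc.DocumentNode.SelectSingleNode("//div[@id='alert']");
        return alert == null ? null : alert.InnerHtml;
    }

    // Collects the id of every div whose class attribute is exactly "ember-view".
    public static List<string> EmberViewIds(string html)
    {
        var ids = new List<string>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@class='ember-view']");
        if (nodes == null)
        {
            return ids;  // no match: SelectNodes gives null, not an empty collection
        }

        foreach (HtmlNode node in nodes)
        {
            // "id" is an attribute of the div; the question's code asked for an
            // attribute named "div", which these elements do not have
            ids.Add(node.GetAttributeValue("id", ""));
        }
        return ids;
    }
}
```

If ReadAlertInnerHtml gives null or an empty string for the downloaded page, the content is most likely filled in by script after loading, and the Html Agility Pack alone will not see it.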

Perhaps something along the following lines may help?

    // webBrowser1 is a WinForms WebBrowser control that has finished loading the page
    var inputs = webBrowser1.Document.GetElementsByTagName("input");
    foreach (HtmlElement input in inputs)
    {
        var id = input.Id;
        var name = input.Name;
        var val = input.OuterHtml;  // can parse value from here
    }
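
To make that a bit more concrete, here is a rough sketch of a form that loads the page in a WebBrowser control and reads one element once loading has finished. The class name ReaderForm, the element id "alert" (borrowed from the question's Inspect Element output) and the URL are placeholders, not something known about the real page.

```csharp
using System;
using System.Windows.Forms;

public class ReaderForm : Form
{
    private readonly WebBrowser browser = new WebBrowser();
    private readonly TextBox outputBox = new TextBox();

    public ReaderForm(string url)
    {
        browser.Dock = DockStyle.Fill;
        outputBox.Dock = DockStyle.Bottom;
        Controls.Add(browser);
        Controls.Add(outputBox);

        browser.ScriptErrorsSuppressed = true;           // hide script error dialogs
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate(url);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        if (browser.Document == null)
        {
            return;
        }

        // "alert" is an assumed id; replace it with the id of the real text box
        HtmlElement element = browser.Document.GetElementById("alert");
        if (element == null)
        {
            return;  // element not present (or not created yet)
        }

        // Inputs keep their text in the "value" attribute; other elements in InnerText
        bool isInput = string.Equals(element.TagName, "INPUT", StringComparison.OrdinalIgnoreCase);
        outputBox.Text = isInput ? element.GetAttribute("value") : element.InnerText;
    }

    [STAThread]
    public static void Main()
    {
        // Placeholder URL; use the address of the page you want to read
        Application.Run(new ReaderForm("http://example.com/"));
    }
}
```

Two caveats: scripts may fill the text box some time after DocumentCompleted fires, so you may need to check again a little later (for example from a Timer), and the WebBrowser control uses Internet Explorer's engine, so a script-heavy page may not behave the same as in your regular browser.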

Licensed under: CC-BY-SA with attribution