Question

I have a WebBrowser control which is being instantiated dynamically from a background STA thread because the parent thread is a BackgroundWorker and has lots of other things to do.

The problem is that the Navigated event never fires, unless I pop a MessageBox.Show() in the method that told it to .Navigate(). I shall explain:

// Started from the BackgroundWorker (parent) thread
ThreadStart ts = new ThreadStart(GetLandingPageContent_ChildThread);
Thread t = new Thread(ts);
t.SetApartmentState(ApartmentState.STA); // must be set before Start()
t.Name = "Mailbox Processor";
t.Start();

protected void GetLandingPageContent_ChildThread()
{
    WebBrowser wb = new WebBrowser();
    wb.Navigated += new WebBrowserNavigatedEventHandler(wb_Navigated);
    wb.Navigate(_url);
    MessageBox.Show("W00t"); // Navigated only fires while this dialog is open
}

protected void wb_Navigated(object sender, WebBrowserNavigatedEventArgs e)
{
    WebBrowser wb = (WebBrowser)sender; // Breakpoint
    HtmlDocument hDoc = wb.Document;
}

This works fine, but the message box will get in the way since this is an automation app. When I remove the MessageBox.Show(), the WebBrowser.Navigated event never fires. I've tried replacing this line with a Thread.Sleep(), and suspending the parent thread instead.

Once I get this out of the way, I intend to suspend the parent thread while the WebBrowser is doing its job, and to find some way of passing the resulting HTML back to the parent thread so it can continue with further logic.

Why does it do this? How can I fix it?

If someone can provide me with a way to fetch the content of a web page, fill out some data, and return the content of the page on the other side of the submit button, all against a web server that supports neither POST nor passing data via the query string, I'll also accept that answer, as this whole exercise will have been unnecessary.


Solution: I ended up just not using the BackgroundWorker and slave thread at all at the suggestion of the team architect... Though at the expense of responsiveness :(


Solution

WebBrowser won't do much unless it is shown and has a UI thread associated; are you showing the form on which it resides? You need to, to use the DOM etc. The form could be off-screen if you don't want to display it to the user, but it won't work well in a service (for example).

For scraping purposes, you can normally simulate a regular HTML browser using WebClient etc. Is this not sufficient? You can use tools like "Fiddler" to investigate the exact request you need to make to the server. For more than that, you might look at the HTML Agility Pack, which offers DOM access to HTML without a browser.
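As a rough illustration of the WebClient route, here is a minimal sketch. The class and method names (PageScraper, GetPage, SubmitForm) are made up for this example, and it assumes the server accepts an ordinary URL-encoded form post and replies in UTF-8. Whether that holds for your server is exactly what a Fiddler capture of the real browser session would tell you.

```csharp
using System;
using System.Collections.Specialized;
using System.Net;
using System.Text;

// Hypothetical helper; names are illustrative, not from the question.
static class PageScraper
{
    // Fetches the page body with a plain GET request.
    public static string GetPage(string url)
    {
        using (var client = new WebClient())
        {
            return client.DownloadString(url);
        }
    }

    // Sends the form fields as a URL-encoded POST and returns the response body.
    // Assumption: the response is UTF-8 encoded.
    public static string SubmitForm(string url, NameValueCollection fields)
    {
        using (var client = new WebClient())
        {
            byte[] response = client.UploadValues(url, "POST", fields);
            return Encoding.UTF8.GetString(response);
        }
    }
}
```

The field names passed to SubmitForm should be copied from the request the browser actually sends, not guessed from the visible form labels.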

OTHER TIPS

The Navigated and DocumentCompleted events will not fire if the Visible property of the WebBrowser is set to false. You can work around this limitation by making the WebBrowser visible but setting its location so that it is outside of the user interface, like:

wb.Visible = true;
wb.Left = -wb.Width; // notice the minus sign

You need to add a line like this:

webBrowser1.Navigated += new WebBrowserNavigatedEventHandler(webBrowser1_Navigated);

where webBrowser1_Navigated is the function you want called when the event fires.

Is there a GUI thread already started? Perhaps the WebBrowser object uses a GUI thread to handle events. In that case, you should call Application.Run() from the thread that creates the WebBrowser (replace your MessageBox.Show() with this). Application.Run() blocks until Application.Exit() is called, or until Application.ExitThread() is called on that same thread.
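To make the message loop idea concrete, here is a minimal sketch of how it could look. BrowserFetcher and FetchHtml are hypothetical names, and the sketch assumes the page has finished loading once DocumentCompleted fires with ReadyState equal to Complete; pages that build their content from script afterwards would need extra handling. MessageBox.Show() "fixes" the original code only because a modal dialog pumps messages; Application.Run() does the same job without showing any UI.

```csharp
using System;
using System.Threading;
using System.Windows.Forms;

// Hypothetical helper; names are illustrative, not from the question.
static class BrowserFetcher
{
    // Blocks the caller until the page has loaded, then returns its HTML.
    public static string FetchHtml(string url)
    {
        string html = null;

        var t = new Thread(() =>
        {
            using (var wb = new WebBrowser())
            {
                wb.ScriptErrorsSuppressed = true;
                wb.DocumentCompleted += (sender, e) =>
                {
                    // DocumentCompleted can fire once per frame; wait for the whole page.
                    if (wb.ReadyState != WebBrowserReadyState.Complete) return;
                    html = wb.DocumentText;
                    Application.ExitThread(); // ends the message loop started below
                };
                wb.Navigate(url);
                Application.Run(); // pumps messages so the browser events can fire
            }
        });
        t.SetApartmentState(ApartmentState.STA); // WebBrowser requires an STA thread
        t.Start();
        t.Join(); // hands the result back to the calling (parent) thread
        return html;
    }
}
```

Calling BrowserFetcher.FetchHtml(_url) from the BackgroundWorker would then replace both the MessageBox.Show() and the plan to suspend the parent thread, since Join() already makes the parent wait. This sketch has no timeout, so a navigation that never completes would block the caller indefinitely.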

Trying to test this now.


A WebBrowser control can't work if it is not on an STA thread. If you want to use a WebBrowser instance on a thread of your own, you need to create the thread and call Thread.SetApartmentState(ApartmentState.STA) on it before starting it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow