Question

Before starting writing this question, i was trying to solve following

// 1. navigate to page
// 2. wait until page is downloaded
// 3. read and write some data from/to iframe 
// 4. submit (post) form

The problem was, that if a iframe exists on a web page, DocumentCompleted event would get fired more then once (after each document has been completed). It was highly likely that program would have tried to read data from DOM that was not completed and naturally - fail.

But suddenly while writing this question 'What if' monster inspired me, and i fix'ed the problem, that i was trying to solve. As i failed Google'ing this, i thought it would be nice to post it here.

    private int iframe_counter = 1; // needs to be 1, to pass DCF test
    public bool isLazyMan = default(bool);

    /// <summary>
    /// LOCK to stop inspecting DOM before DCF
    /// </summary>
    public void waitPolice() {
        while (isLazyMan) Application.DoEvents();
    }

    private void webBrowser1_Navigating(object sender, WebBrowserNavigatingEventArgs e) {
        if(!e.TargetFrameName.Equals(""))
            iframe_counter --;
        isLazyMan = true;
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
        if (!((WebBrowser)sender).Document.Url.Equals(e.Url))
            iframe_counter++;
        if (((WebBrowser)sender).Document.Window.Frames.Count <= iframe_counter) {//DCF test
            DocumentCompletedFully((WebBrowser)sender,e);
            isLazyMan = false; 
        }
    }

    private void DocumentCompletedFully(WebBrowser sender, WebBrowserDocumentCompletedEventArgs e){
        //code here
    }

For now at least, my 5m hack seems to be working fine.

Maybe i am really failing at querying google or MSDN, but i can not find: "How to use webbrowser control DocumentCompleted event in C# ?"

Remark: After learning a lot about webcontrol, I found that it does FuNKY stuff.

Even if you detect that the document has completed, in most cases it wont stay like that forever. Page update can be done in several ways - frame refresh, ajax like request or server side push (you need to have some control that supports asynchronous communication and has html or JavaScript interop). Also some iframes will never load, so it's not best idea to wait for them forever.

I ended up using:

if (e.Url != wb.Url)
Was it helpful?

Solution

You might want to know the AJAX calls as well.

Consider using this:

private void webBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    string url = e.Url.ToString();
    if (!(url.StartsWith("http://") || url.StartsWith("https://")))
    {
            // in AJAX
    }

    if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
    {
            // IFRAME 
    }
    else
    {
            // REAL DOCUMENT COMPLETE
    }
}

OTHER TIPS

I have yet to find a working solution to this problem online. Hopefully this will make it to the top and save everyone the months of tweaking I spent trying to solve it, and the edge cases associated with it. I have fought over this issue over the years as Microsoft has changed the implementation/reliability of isBusy and document.readystate. With IE8, I had to resort to the following solution. It's similar to the question/answer from Margus with a few exceptions. My code will handle nested frames, javascript/ajax requests and meta-redirects. I have simplified the code for clarity sake, but I also use a timeout function (not included) to reset the webpage after if 5 minutes domAccess still equals false.

private void m_WebBrowser_BeforeNavigate(object pDisp, ref object URL, ref object Flags, ref object TargetFrameName, ref object PostData, ref object Headers, ref bool Cancel)
{
    //Javascript Events Trigger a Before Navigate Twice, but the first event 
    //will contain javascript: in the URL so we can ignore it.
    if (!URL.ToString().ToUpper().StartsWith("JAVASCRIPT:"))
    {
        //indicate the dom is not available
        this.domAccess = false;
        this.activeRequests.Add(URL);
    }
}

private void m_WebBrowser_DocumentComplete(object pDisp, ref object URL) 
{

    this.activeRequests.RemoveAt(0);

    //if pDisp Matches the main activex instance then we are done.
    if (pDisp.Equals((SHDocVw.WebBrowser)m_WebBrowser.ActiveXInstance)) 
    {
        //Top Window has finished rendering 
        //Since it will always render last, clear the active requests.
        //This solves Meta Redirects causing out of sync request counts
        this.activeRequests.Clear();
    }
    else if (m_WebBrowser.Document != null)
    {
        //Some iframe completed dom render
    }

    //Record the final complete URL for reference
    if (this.activeRequests.Count == 0)
    {
        //Finished downloading page - dom access ready
        this.domAccess = true;
    }
}

Unlike Thorsten I didn't have to use ShDocVw, but what did make the difference for me was adding the loop checking ReadyState and using Application.DoEvents() while not ready. Here is my code:

        this.webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(WebBrowser_DocumentCompleted);
        foreach (var item in this.urlList) // This is a Dictionary<string, string>
        {
            this.webBrowser.Navigate(item.Value);
            while (this.webBrowser1.ReadyState != WebBrowserReadyState.Complete)
            {
                Application.DoEvents();
            }
        }

And I used Yuki's solution for checking the results of WebBrowser_DocumentCompleted, though with the last if/else swapped per user's comment:

     private void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        string url = e.Url.ToString();
        var browser = (WebBrowser)sender;

        if (!(url.StartsWith("http://") || url.StartsWith("https://")))     
        {             
            // in AJAX     
        }
        if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)     
        {
            // IFRAME           
        }     
        else     
        {             
            // REAL DOCUMENT COMPLETE
            // Put my code here
        }
    }

Worked like a charm :)

I had to do something similar. What I do is use ShDocVw directly (adding a reference to all the necessary interop assemblies to my project). Then, I do not add the WebBrowser control to my form, but the AXShDocVw.AxWebBrowser control.

To navigate and wait I use to following method:

private void GotoUrlAndWait(AxWebBrowser wb, string url)
{
    object dummy = null;
    wb.Navigate(url, ref dummy, ref dummy, ref dummy, ref dummy);

    // Wait for the control the be initialized and ready.
    while (wb.ReadyState != SHDocVw.tagREADYSTATE.READYSTATE_COMPLETE)
        Application.DoEvents();
}

Just thought to drop a line or two here about a small improvement which works in conjunction with the code of FeiBao. The idea is to inject a landmark (javascript) variable in the webpage and use that to detect which of the subsequent DocumentComplete events is the real deal. I doubt it's bulletproof but it has worked more reliably in general than the approach that lacks it. Any comments welcome. Here is the boilerplate code:

 void WebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        string url = e.Url.ToString();
        var browser = (WebBrowser)sender;

        if (!(url.StartsWith("http://") || url.StartsWith("https://")))
        {
            // in AJAX     
        }
        if (e.Url.AbsolutePath != this.webBrowser.Url.AbsolutePath)
        {
            // IFRAME           
        }
        else if (browser.Document != null && (bool)browser.Document.InvokeScript("eval", new object[] { @"typeof window.YourLandMarkJavascriptVariableHere === 'undefined'" }))
        {
            ((IHTMLWindow2)browser.Document.Window.DomWindow).execScript("var window.YourLandMarkJavascriptVariableHere = true;");

            // REAL DOCUMENT COMPLETE
            // Put my code here
        }
    }
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top