The WebBrowser control seems to re-arrange attributes within HTML tags when setting webBrowser1.DocumentText.
I'm wondering if there is some kind of render mode or document encoding that I am missing. The problem can be reproduced by simply adding a RichTextBox control (txt_htmlBody) and a WebBrowser control (webBrowser1) to a Windows Form.
Add the webBrowser1 WebBrowser control, and add a handler for its DocumentCompleted event, webBrowser1_DocumentCompleted.
I used this handler to attach my mouse click event to the WebBrowser control.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Attach an event to handle mouse clicks on the web browser
    this.webBrowser1.Document.Body.MouseDown += new HtmlElementEventHandler(Body_MouseDown);
}
In the mouse click handler, we find out which element was clicked, like so:
private void Body_MouseDown(Object sender, HtmlElementEventArgs e)
{
    // Get the clicked HTML element
    HtmlElement elem = webBrowser1.Document.GetElementFromPoint(e.ClientMousePosition);
    if (elem != null)
    {
        highLightElement(elem);
    }
}
private void highLightElement(HtmlElement elem)
{
    int len = this.txt_htmlBody.TextLength;
    int index = 0;
    // Lower-case both strings so the search is not case-sensitive
    string textToSearch = this.txt_htmlBody.Text.ToLower();
    string textToFind = elem.OuterHtml.ToLower();
    int lastIndex = textToSearch.LastIndexOf(textToFind);
    // The text is never found, because the WebBrowser control has re-arranged the attributes in the <img> tag
    // What the WebBrowser control returns: "<img border=0 alt=\"\" src=\"images/promo-green2_01_04.jpg\" width=393 height=30>"
    // What was passed to the WebBrowser control from the text box: <img src="images/PROMO-GREEN2_01_04.jpg" width="393" height="30" border="0" alt=""/>
    // As you can see, I will never be able to find my data in the source because the WebBrowser control has changed it
}
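To make the mismatch concrete outside the form, here is a minimal console sketch that runs the same lower-case LastIndexOf search on the two strings quoted in the comments of highLightElement. The class name AttributeOrderDemo and the helper FindLiteral are hypothetical names used only for this illustration; the two strings are copied from the comments, not captured from a live WebBrowser control.

```csharp
using System;

// Hypothetical demo class; not part of the form code.
public static class AttributeOrderDemo
{
    // Same search as highLightElement: lower-case both strings, then look
    // for the element's markup as a literal substring of the source text.
    public static int FindLiteral(string sourceText, string outerHtml)
    {
        string textToSearch = sourceText.ToLower();
        string textToFind = outerHtml.ToLower();
        return textToSearch.LastIndexOf(textToFind, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // What was typed into txt_htmlBody (copied from the comments above).
        string source = "<img src=\"images/PROMO-GREEN2_01_04.jpg\" width=\"393\" height=\"30\" border=\"0\" alt=\"\"/>";

        // What elem.OuterHtml reported (copied from the comments above).
        string outerHtml = "<img border=0 alt=\"\" src=\"images/promo-green2_01_04.jpg\" width=393 height=30>";

        // The attributes are re-ordered and unquoted, so a literal search fails.
        Console.WriteLine(FindLiteral(source, outerHtml)); // prints -1
    }
}
```

The search only succeeds when the markup returned by the control is character-for-character identical (ignoring case) to what is in the text box, which is not the case here.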
Add the txt_htmlBody RichTextBox to the form, and handle its TextChanged event so that webBrowser1.DocumentText is updated whenever the text of the RichTextBox (txt_htmlBody) changes.
private void txt_htmlBody_TextChanged(object sender, EventArgs e)
{
    try
    {
        webBrowser1.DocumentText = txt_htmlBody.Text.Replace("\n", String.Empty);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}
When you run the program, copy the example HTML below into txt_htmlBody, click the image on the right, and debug highLightElement. You will see from my comments why I cannot find the specified text in my search string: the WebBrowser control re-arranges the attributes.
<img src="images/PROMO-GREEN2_01_04.jpg" width="393" height="30" border="0" alt=""/>
Does anyone know how to make the WebBrowser control render my HTML as-is, or at least return it unchanged?
Thank you for your time.