HTML source is different in WebClient and web browser

https://stackoverflow.com/questions/22602310

19-06-2023

Question

I am creating a C# 4.0 application that downloads web page content using WebClient.

When I examine the content downloaded by WebClient, it is slightly different from what the browser shows (I use the same URL in Mozilla Firefox and in my WebClient function).

The browser shows the content correctly, but WebClient's DownloadString returns different HTML. Please see the WebClient response below.

WebClient downloaded HTML

<!DOCTYPE html>
<head>
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
<meta http-equiv="cache-control" content="max-age=0" />
<meta http-equiv="cache-control" content="no-cache" />
<meta http-equiv="expires" content="0" />
<meta http-equiv="expires" content="Tue, 01 Jan 1980 1:00:00 GMT" />
<meta http-equiv="pragma" content="no-cache" />
<meta http-equiv="refresh" content="10; url=/distil_r_captcha.html?Ref=/pgol/4-abbigliamento/3-Roma%20(RM)&distil_RID=956FEC70-B30F-11E3-A9C9-29845DBA1712" />
<script type="text/javascript" src="/ga.1550061718605.js?PID=6D4E4D1D-7094-375D-A439-0568A6A70836" defer></script><style type="text/css">#d__fFH{position:absolute;top:-5000px;left:-5000px}#d__fF{font-family:serif;font-size:200px;visibility:hidden}#electron9158f7e8,#sheltersf1491b2d,#columns375c0195,#sheltersf1491b2d{display:none!important}</style></head>
<body>
<div id="distil_ident_block">&nbsp;</div>
<div id="d__fFH"><OBJECT id="d_dlg" CLASSID="clsid:3050f819-98b5-11cf-bb82-00aa00bdce0b" width="0px" height="0px"></OBJECT><span id="d__fF"></span></div></body>
</html>

Browser META tag declaration

<meta name="robots" content="noindex,follow"/>

I am clueless. What is the reason for the web browser and WebClient showing different HTML?

Edit

Sorry for my incomplete question. It's not an uppercase or lowercase issue.

The web page contains a list of data, and I want to retrieve this data from the downloaded HTML string. In the current situation that is not possible, because the HTML downloaded by WebClient is returned without this data.

But when I navigate to the same URL in a browser, it shows all the data correctly. What might be the reason for the difference between the content returned to the browser and to WebClient?

Solution

Well, I think it is fairly clear that WebClient and your browser parse and display web content differently, because they were implemented in different ways, by different programmers and different vendors.

But the question you should be asking yourself is: does it really matter? The semantic and syntactic meaning is exactly the same, so why bother about it? It is not important whether a name is uppercase or lowercase, or whether there is a space after the comma.

As for the HTML specification, see the quote below, taken from the W3C Working Group Note:

Tag names for HTML elements may be written with any mix of lowercase and uppercase letters that are a case-insensitive match for the names of the elements given in the HTML elements section of this document; that is, tag names are case-insensitive

Basically, it doesn't matter whether it is lowercase or uppercase; it is still HTML.
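
For the case described in the question's edit, where the data itself is missing, note that the downloaded HTML shown in the question contains a meta refresh to /distil_r_captcha.html and an element with the id distil_ident_block, neither of which appears in the browser's meta tag. Below is a minimal sketch, assuming those two strings reliably mark that placeholder page (an assumption based only on the sample in the question), of how the downloaded string could be checked before trying to parse data out of it. The class and method names are hypothetical.

```csharp
using System;

static class InterstitialCheck
{
    // Marker strings copied from the downloaded HTML in the question.
    // Treating them as a reliable signal is an assumption.
    static readonly string[] Markers = { "distil_r_captcha", "distil_ident_block" };

    // Hypothetical helper: returns true when the HTML contains any marker,
    // i.e. it looks like the placeholder page rather than the real content.
    public static bool LooksLikeInterstitial(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return false;
        }

        foreach (string marker in Markers)
        {
            // Case-insensitive search, since HTML casing can vary.
            if (html.IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                return true;
            }
        }

        return false;
    }
}
```

Calling InterstitialCheck.LooksLikeInterstitial on the string returned by DownloadString would tell you whether you received the placeholder page shown in the question; if it returns true, the list of data is not in that response at all.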

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow