DownloadData() produces HTML different from the browser
05-07-2019
Question
I'm trying to download the source HTML of a website using the WebClient.DownloadData() method.
My method is supposed to give me the source:
public string GetSite(string URL)
{
    Uri Site = new Uri(URL);
    byte[] lol = Client.DownloadData(Site);
    SiteSource = Encoding.ASCII.GetString(lol);
    return SiteSource;
}
I've triple-checked: when I type into the browser the exact same URL that I pass to this method, the browser shows one page, but my program downloads something else entirely.
Pressing Ctrl+U in Firefox to view the source shows me what I need to see (again, simple HTML), but in my software I see something entirely different.
What gives?
FOR CLARITY:
Imagine you enter www.google.com in Firefox. Viewing the source in Firefox, you see:
<html>
<head>
</head>
<body>
<h1>Hello!</h1>
</body>
</html>
But if I were to use the DownloadData method for the exact same URL, my program would download source code like this:
<html>
<head>
</head>
<body>
<h1>Bonjour!</h1>
</body>
</html>
Solution
The site may be doing browser detection, and serving up different HTML depending on whether it perceives the client to be Firefox, IE, a Web crawler, etc.
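One way to test this theory is to make the request identify itself the way the browser does. The following is only a minimal sketch, assuming the site decides based on the User-Agent header. The class and method names (BrowserLikeDownload, CreateBrowserLikeClient) are hypothetical, the User-Agent string is an illustrative Firefox-style value rather than one taken from the question, and decoding the bytes as UTF-8 is also an assumption:

```csharp
using System;
using System.Net;
using System.Text;

public static class BrowserLikeDownload
{
    // Hypothetical helper: builds a WebClient that announces a browser-style User-Agent.
    public static WebClient CreateBrowserLikeClient()
    {
        var client = new WebClient();
        // Illustrative value only; copy the real string from your own browser's request headers.
        client.Headers.Add("User-Agent",
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0");
        return client;
    }

    public static string GetSite(string url)
    {
        // A fresh client per call, so the header is always set for this request.
        using (WebClient client = CreateBrowserLikeClient())
        {
            byte[] raw = client.DownloadData(new Uri(url));
            // Assumption: the page is UTF-8 encoded; adjust if the site declares another charset.
            return Encoding.UTF8.GetString(raw);
        }
    }
}
```

If the HTML returned this way matches what Ctrl+U shows, browser detection was the likely cause. If it still differs, the site is probably looking at something other than the User-Agent.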
OTHER TIPS
The site might use cookies that are set in Firefox, the User-Agent header, or other HTTP headers to decide what content to send you.
Since your C# program sends different data than Firefox does, the site might send different content.
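To narrow down which piece of request data matters, one approach is to copy headers from the request Firefox sends (visible in its network tools) onto the WebClient, one at a time. This is a rough sketch under stated assumptions: HeaderMatchingDownload, CreateClient and ExampleHeaders are hypothetical names, and the header values shown are placeholders, not values from the question. An Accept-Language header is one plausible explanation for a "Hello!" versus "Bonjour!" style difference, but that is a guess:

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

public static class HeaderMatchingDownload
{
    // Placeholder values; replace them with the headers your browser actually sends.
    public static IDictionary<string, string> ExampleHeaders()
    {
        return new Dictionary<string, string>
        {
            { "Accept-Language", "en-US,en;q=0.5" },
            { "Cookie", "name=value" }
        };
    }

    // Hypothetical helper: copies the given request headers onto a new WebClient.
    public static WebClient CreateClient(IDictionary<string, string> headers)
    {
        var client = new WebClient();
        foreach (KeyValuePair<string, string> header in headers)
        {
            client.Headers.Add(header.Key, header.Value);
        }
        return client;
    }

    public static string GetSite(string url, IDictionary<string, string> headers)
    {
        using (WebClient client = CreateClient(headers))
        {
            byte[] raw = client.DownloadData(new Uri(url));
            // Assumption: the response is UTF-8 encoded.
            return Encoding.UTF8.GetString(raw);
        }
    }
}
```

Calling HeaderMatchingDownload.GetSite(url, HeaderMatchingDownload.ExampleHeaders()) and then removing entries one by one should show which header, if any, changes the response.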