Question

Below is some code that is supposed to just return the source code of a page. However, this page's source keeps coming back filled with Webdings-like symbols and invalid characters (hundreds of these: �). I've tried various header settings, but none of them fixed the problem.

string url2 = "http://mcassessor.maricopa.gov/?s=176-09-419";
HttpWebRequest request2 = (HttpWebRequest)WebRequest.Create(url2);

request2.CookieContainer = cookieJar;
request2.Method = "GET";
request2.Accept = "text/html, application/xhtml+xml, */*";
request2.Headers.Add("Accept-Language: en-US,en;q=0.5");
request2.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; MAM3; rv:11.0) like Gecko";
request2.Headers.Add("Accept-Encoding: gzip, deflate");
request2.Headers.Add("X-UA-Compatible: IE=edge,chrome=1");

using (HttpWebResponse response2 = (HttpWebResponse)request2.GetResponse())
{
        // Read from the response obtained above instead of calling GetResponse() a second time.
        string sourceCode2 = new StreamReader(response2.GetResponseStream()).ReadToEnd();
}

Solution

That's because the response is coming back gzip-compressed, and the code reads the compressed bytes as if they were plain text. You're asking the server for compression here:

request2.Headers.Add("Accept-Encoding: gzip, deflate");

You can either remove that header, or tell the request to decompress the response automatically:

request2.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
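
For completeness, here is a minimal sketch of how the whole request might look with automatic decompression turned on. It assumes the same System.Net HttpWebRequest API used in the question; the names PageFetcher, CreateRequest and Fetch are made up for illustration. The manual Accept-Encoding header is left out on the assumption that the request advertises the supported encodings itself once AutomaticDecompression is set.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class, not part of the original question or answer.
public static class PageFetcher
{
    // Builds the request. Nothing is sent until GetResponse() is called.
    public static HttpWebRequest CreateRequest(string url, CookieContainer cookieJar)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookieJar;
        request.Method = "GET";
        request.Accept = "text/html, application/xhtml+xml, */*";
        request.Headers.Add("Accept-Language", "en-US,en;q=0.5");

        // Ask the framework to unpack gzip and deflate bodies transparently,
        // so the stream read below yields plain text rather than compressed bytes.
        request.AutomaticDecompression =
            DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request and returns the (already decompressed) body as a string.
    public static string Fetch(string url, CookieContainer cookieJar)
    {
        HttpWebRequest request = CreateRequest(url, cookieJar);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling PageFetcher.Fetch(url2, cookieJar) should then give readable HTML. If the text still looks wrong after decompression, the next thing to check would be the character encoding passed to StreamReader.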
Licensed under: CC-BY-SA with attribution