Question

I am getting the page source with

    Uri url = new Uri(urlAddress);
    WebClient client = new WebClient();
    client.Encoding = System.Text.Encoding.UTF8;   // decode the response as UTF-8
    string html = client.DownloadString(url);

but the result has garbled characters for kickass.to (a torrent site), even though the page source contains

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I also tried the method from http://www.tech-recipes.com/rx/1954/get_web_page_contents_in_code_with_csharp/ to get the source, which didn't work either.

Example of the source I get: http://pastebin.com/ycBjWLRi

How can I get the source code properly?


Solution

I noticed something about forcing the character encoding in an article I read recently.

It says you should set it up like this:

    HtmlWeb htmlWeb = new HtmlWeb() {
        AutoDetectEncoding = false,
        OverrideEncoding = Encoding.GetEncoding("iso-8859-2")
    };

This uses Html Agility Pack, which you tagged your question with, but you don't seem to have actually used it in your code example above or in the tech-recipes.com article you linked to.
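If you would rather stay with WebClient, the same idea (forcing the encoding yourself) can be sketched by downloading the raw bytes and decoding them explicitly. This is only a rough sketch, assuming the garbling really is a charset mismatch and not something else such as a compressed response. `PageSource`, `Decode` and `Download` are hypothetical names made up for the example, and `urlAddress` is the variable from the question.

```csharp
using System;
using System.Net;
using System.Text;

static class PageSource
{
    // Decode raw bytes with an explicitly named charset.
    // Note: on .NET Core / .NET 5+, legacy code pages such as "iso-8859-2"
    // need the System.Text.Encoding.CodePages provider registered first;
    // "utf-8" is available everywhere.
    public static string Decode(byte[] raw, string charsetName)
    {
        return Encoding.GetEncoding(charsetName).GetString(raw);
    }

    // Hypothetical helper: fetch the bytes untouched, then decode them
    // ourselves instead of relying on WebClient.Encoding.
    public static string Download(string urlAddress, string charsetName)
    {
        using (var client = new WebClient())
        {
            byte[] raw = client.DownloadData(new Uri(urlAddress));
            return Decode(raw, charsetName);
        }
    }
}
```

You would call it as `string html = PageSource.Download(urlAddress, "utf-8");` and swap in a different charset name if the page turns out to be served in another encoding than its meta tag claims.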

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow