Question

I am downloading a web page (http://library.islamweb.net/hadith/RawyDetails.php?RawyID=1). It contains some Arabic text, which looks fine when viewed with the "View Source" option in a browser (Chrome/IE):

<span lang="ar-qa">رقم الراوي</span>

However, when downloaded, it looks like this:

<span lang="ar-qa">ÑÞã ÇáÑÇæí</span>

My code is very simple:

client.DownloadFile(_webPath, savePath);

What is wrong?


Solution

The page's character set is "windows-1256", so you need to read it using that encoding:

// Requires: using System.IO; using System.Text; using System.Windows.Forms;
private void GetRepliesStats_Load(object sender, EventArgs e)
{
    WebBrowser bro = new WebBrowser();
    // Subscribe before navigating so the completion event cannot be missed.
    bro.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(BrowsingCompleted);
    bro.Navigate("http://library.islamweb.net/hadith/RawyDetails.php?RawyID=1");
}

private void BrowsingCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    WebBrowser browser = sender as WebBrowser;

    // Rewind the raw document stream, then decode its bytes as windows-1256.
    Stream documentStream = browser.DocumentStream;
    documentStream.Position = 0L;
    StreamReader streamReader = new StreamReader(documentStream, Encoding.GetEncoding("windows-1256"));

    // My_Result now holds the page source with the Arabic text decoded correctly.
    String My_Result = streamReader.ReadToEnd();
}
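
To see why the encoding matters, here is a small sketch that reproduces the garbled text from the question. It assumes the viewer that showed the downloaded file interpreted the bytes as windows-1252 (that is a guess, but it yields exactly the string quoted above). The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Encodes the text as windows-1256 (what the server sends) and then
    // decodes the very same bytes as windows-1252 (assumed viewer default).
    public static string Garble(string arabic)
    {
        // Needed on .NET Core / .NET 5+ to enable legacy code pages.
        // On .NET Framework they are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] raw = Encoding.GetEncoding("windows-1256").GetBytes(arabic);
        return Encoding.GetEncoding("windows-1252").GetString(raw);
    }

    static void Main()
    {
        Console.OutputEncoding = Encoding.UTF8;
        // Prints the garbled form of the span text from the question.
        Console.WriteLine(Garble("رقم الراوي"));
    }
}
```

The bytes in the downloaded file are not damaged; they are only being read with the wrong code page.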

I hope this helps.
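
If you would rather keep the WebClient from the question, one possible approach is to download the raw bytes, decode them as windows-1256, and save the text as UTF-8. This is only a sketch: the class name, method names, and the output file name are hypothetical, and it assumes the server really sends windows-1256 for this page.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class ArabicPageDownloader
{
    // Decodes raw bytes that are assumed to be windows-1256 text.
    public static string DecodeWindows1256(byte[] raw)
    {
        // Needed on .NET Core / .NET 5+ to enable legacy code pages.
        // On .NET Framework they are built in and this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding("windows-1256").GetString(raw);
    }

    // Downloads the page as bytes and re-saves it as UTF-8 text.
    public static void DownloadAsUtf8(string url, string savePath)
    {
        using (var client = new WebClient())
        {
            byte[] raw = client.DownloadData(url);
            File.WriteAllText(savePath, DecodeWindows1256(raw), Encoding.UTF8);
        }
    }

    static void Main()
    {
        // "RawyDetails.html" is a placeholder output path.
        DownloadAsUtf8(
            "http://library.islamweb.net/hadith/RawyDetails.php?RawyID=1",
            "RawyDetails.html");
    }
}
```

If you only need the text in memory, setting client.Encoding to the windows-1256 encoding and calling client.DownloadString should have the same effect.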

Licensed under: CC-BY-SA with attribution