I have to convert the content of a mail message to XML format, but I am facing some encoding problems. All my accented characters, and some others, appear in the message file as their hex values. For example:

é is displayed as =E9,
ô is displayed as =F4,
= is displayed as =3D...

The mail is configured to be sent with ISO-8859-1 encoding, and I can see these parameters in the file:

Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Notepad++ detects the file as "ANSI as UTF-8".

I need to convert it in C# (I am in a script task in an SSIS project) so that it is readable, but I cannot manage to do that.

I tried setting the encoding to UTF-8 in my StreamReader, but that changes nothing. Despite my reading on the topic, I still do not really understand the steps that lead to my problem or how to solve it.

Note that Outlook decodes the message correctly, and the accented characters are displayed properly.

Thanks in advance.


Solution

OK, I was looking in the wrong direction. The keyword here is "Quoted-Printable". This is where my issue comes from, and this is what I really have to decode.

To do that, I followed the example posted by Martin Murphy in this thread:

C#: Class for decoding Quoted-Printable encoding?

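The method described there, adapted here to decode in a single pass, is:

// Requires: using System; using System.Text.RegularExpressions;
public static string DecodeQuotedPrintables(string input)
{
    // Remove soft line breaks ("=" at end of line) first, so that a decoded
    // "=" (from =3D) at the end of a line is not mistaken for one.
    input = input.Replace("=\r\n", "");
    // Decode every =XX escape in one pass; a repeated string Replace could
    // re-read a freshly decoded "=" as the start of another escape.
    return Regex.Replace(input, "=([0-9A-Fa-f]{2})", match =>
    {
        // ISO-8859-1 byte values map directly onto the first 256 Unicode code points.
        char hexChar = (char) Convert.ToInt32(match.Groups[1].Value, 16);
        return hexChar.ToString();
    });
}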
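
The cast from a hex value to a char in the method above gives the right result here because ISO-8859-1 byte values coincide with the first 256 Unicode code points. For a message declaring another charset (UTF-8, for instance), the escapes would need to be collected as bytes and decoded with that charset. The following is only a sketch of that idea; the class name QuotedPrintableSketch and the charset parameter are assumptions for illustration and do not come from the linked thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedPrintableSketch
{
    // Hypothetical helper: decodes a quoted-printable body into bytes,
    // then interprets those bytes with the charset named in Content-Type.
    public static string Decode(string input, string charset)
    {
        var bytes = new List<byte>(input.Length);
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            bool hasTwoMore = i + 2 < input.Length;
            if (c == '=' && hasTwoMore && input[i + 1] == '\r' && input[i + 2] == '\n')
            {
                i += 2; // soft line break: drop "=\r\n"
            }
            else if (c == '=' && hasTwoMore
                     && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
            {
                // =XX escape: one raw byte of the original text
                bytes.Add(Convert.ToByte(input.Substring(i + 1, 2), 16));
                i += 2;
            }
            else
            {
                // Assumption: literal characters in a quoted-printable body are ASCII
                bytes.Add((byte)c);
            }
        }
        return Encoding.GetEncoding(charset).GetString(bytes.ToArray());
    }
}
```

For the message in the question, the call would be QuotedPrintableSketch.Decode(myString, "ISO-8859-1"), with the charset taken from the Content-Type header.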

To summarize, I open a StreamReader in UTF-8 and append each line read to a string like this:

myString += line + "\r\n";

I then open my StreamWriter in UTF-8 too and write the decoded myString variable to it:

myStreamWriter.WriteLine(DecodeQuotedPrintables(myString));
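
Put together, the read, decode and write steps could look roughly like the sketch below. The class name MailBodyConverter, the method ConvertFile and its parameters are hypothetical; the decoder is passed in as a delegate so that the sketch stays self-contained. A StringBuilder replaces the repeated += concatenation, which is a substitution on my part to avoid reallocating the string on every line.

```csharp
using System;
using System.IO;
using System.Text;

public static class MailBodyConverter
{
    // Hypothetical helper: reads the message file line by line, rebuilds the
    // text with "\r\n" line endings, decodes it, and writes the result as UTF-8.
    public static void ConvertFile(string inputPath, string outputPath,
                                   Func<string, string> decode)
    {
        var body = new StringBuilder();
        using (var reader = new StreamReader(inputPath, Encoding.UTF8))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // Restore CRLF so that soft line breaks ("=\r\n") can be recognised
                body.Append(line).Append("\r\n");
            }
        }

        using (var writer = new StreamWriter(outputPath, false, Encoding.UTF8))
        {
            writer.WriteLine(decode(body.ToString()));
        }
    }
}
```

With the method from the answer, the call would be MailBodyConverter.ConvertFile(inputPath, outputPath, DecodeQuotedPrintables), where both paths are placeholders for the real file locations.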