Question

I am trying to send emails that contain non-ASCII characters using the SmtpClient and MailMessage classes.

I am using an external mailing service (MailChimp) and some of my emails have been rejected by their SMTP server. I have contacted them and this is what they replied:

It appears the subject line is being Base64 encoded and then Quoted-Printable encoded, which generally should be fine, but one of the characters is being broken across two lines. So when your subject lines are a bit longer, in order to be processed correctly, it's broken onto two lines. When using UTF-8 quoted printable in a subject line, character strings aren't supposed to be broken between lines. Instead a line should be shortened so that the full character string remains together. In this case, that's not happening, so the string of characters that represents a single character is being broken across multiple lines, and therefore isn't validly UTF-8 quoted-printable encoded.

The problematic subject is the following:

Subject: XXXXXXX - 5 personnes vous ont nommé guide

Which is, in UTF-8/Base64:

Subject: WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3DqSBndWlkZQ==

Because that header would exceed a certain maximum length (I am unsure whether it is the Quoted-Printable limit of 76 characters per line or the SMTP header limit), the header is split after encoding and becomes:

Subject: =?utf-8?B?WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3D?=
 =?utf-8?B?qSBndWlkZQ==?=

Apparently this causes an issue when decoding (because the first line cannot be decoded to a valid string). I am not sure I fully understand the problem, and I have the following questions:

  • Why is the ?utf-8?B? part repeated? Shouldn't the QP encoding happen before splitting the line and thus its header shouldn't be repeated?
  • After QP-decoding, shouldn't we end up with a valid 1-line Base64 string?
  • There is a space at the start of the second line which is outside of the QP encoding, could this be the problem?
  • Is the encoder broken, or is it the decoder?

Also note that some other SMTP servers will accept this message, though that does not mean it is valid.
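
For reference, the B in =?utf-8?B?...?= means Base64 (RFC 2047), and every encoded word is a self-contained unit of at most 75 characters, which is why the =?utf-8?B? prefix is repeated on the second line. RFC 2047 also requires each encoded word to contain a whole number of characters. The sketch below checks the two payloads from the header above against that rule. The class and method names are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text;

public static class EncodedWordCheck
{
    // Strict decoder: throws on invalid bytes instead of substituting U+FFFD.
    private static readonly UTF8Encoding StrictUtf8 = new UTF8Encoding(false, true);

    // Returns true when the Base64 payload of a single encoded word
    // decodes to complete UTF-8 characters on its own.
    public static bool DecodesOnItsOwn(string base64Payload)
    {
        byte[] bytes = Convert.FromBase64String(base64Payload);
        try
        {
            StrictUtf8.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Concatenates the raw bytes of both payloads before decoding as UTF-8.
    public static string DecodeJoined(string firstPayload, string secondPayload)
    {
        byte[] a = Convert.FromBase64String(firstPayload);
        byte[] b = Convert.FromBase64String(secondPayload);
        byte[] all = new byte[a.Length + b.Length];
        Buffer.BlockCopy(a, 0, all, 0, a.Length);
        Buffer.BlockCopy(b, 0, all, a.Length, b.Length);
        return StrictUtf8.GetString(all);
    }
}
```

With the payloads from the header, DecodesOnItsOwn returns false for both "WFhYWFhYWCAtIDUgcGVyc29ubmVzIHZvdXMgb250IG5vbW3D" (it ends with the lone lead byte 0xC3) and "qSBndWlkZQ==" (it starts with the continuation byte 0xA9), while DecodeJoined returns the original subject. A decoder that joins the bytes before decoding would therefore accept the message, which may be why some servers do not reject it, but that is a guess.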

As a workaround, I tried to disable the Base64 encoding, which is apparently unnecessary. However, the MailMessage class only has a BodyTransferEncoding property, which controls this encoding for the body of the message. No property seems to control the "transfer" encoding of the subject.


Solution

This was confirmed as a bug in the MSDN forums:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/4d1c1752-70ba-420a-9510-8fb4aa6da046/subject-encoding-on-smtpclientmailmessage

And a bug was filed on Microsoft Connect: https://connect.microsoft.com/VisualStudio/feedback/details/785710/mailmessage-subject-incorrectly-encoded-in-utf-8-base64

One workaround is to set the SubjectEncoding of the MailMessage to another encoding, such as ISO-8859-1. In that case the subject is encoded as Quoted-Printable (not Base64), which avoids the problem.
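
A minimal sketch of that workaround follows. The class name, method name and parameters are hypothetical. Keep in mind that ISO-8859-1 only covers Latin-1 characters, so this approach presumably only suits subjects such as the French one in the question.

```csharp
using System.Net.Mail;
using System.Text;

public static class SubjectWorkaround
{
    // Builds a message whose subject is declared as ISO-8859-1 instead of UTF-8.
    public static MailMessage Build(string from, string to, string subject)
    {
        var mail = new MailMessage(from, to);
        mail.Subject = subject;
        mail.SubjectEncoding = Encoding.GetEncoding("ISO-8859-1");
        return mail;
    }
}
```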

OTHER TIPS

A better solution is to use Encoding.Unicode instead of Encoding.UTF8 for the SubjectEncoding.

This appears to work because the Microsoft implementation treats every UTF-16 character as exactly two bytes, ignoring the fact that UTF-16 can need more than two bytes for a character (see "Why does C# use UTF-16 for strings?"). With that stable character size, a split is less likely to fall in the middle of a character.

I've seen this used on https://gist.github.com/dbykadorov/9047455.
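
To illustrate the character-size argument, the sketch below compares how many bytes the same text takes in UTF-8 and in UTF-16 (Encoding.Unicode). The class and method names are hypothetical, and how SmtpClient actually chooses its split points is not verified here.

```csharp
using System.Text;

public static class CharWidthDemo
{
    // Number of bytes the text occupies in UTF-8 (variable width: 1 to 4 bytes per character).
    public static int Utf8Bytes(string text)
    {
        return Encoding.UTF8.GetByteCount(text);
    }

    // Number of bytes the text occupies in UTF-16 (2 bytes per char, 4 for a surrogate pair).
    public static int Utf16Bytes(string text)
    {
        return Encoding.Unicode.GetByteCount(text);
    }
}
```

For "e" the counts are 1 (UTF-8) and 2 (UTF-16), for "\u00E9" (é) they are 2 and 2, and for "\uD83D\uDE00" (a character outside the Basic Multilingual Plane) they are 4 and 4. In UTF-16 only characters of the last kind can be broken by a split that falls on a two-byte boundary, whereas in UTF-8 any non-ASCII character can be.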

My solution to this problem is some kind of trick!

I use Persian in the mail subject and send my mail with SmtpClient on .NET Framework 4.5.2. Whatever the subject is, the received subject shows garbage characters at certain positions, e.g. around the 18th and 38th characters of the subject string.

I then tried inserting some spaces (character 32) at these positions, and after re-sending the mail the result was very good: the Unicode subject was displayed as expected.

So I wrote a function that inserts spaces at my required positions, padding up to 6 characters past the given position and never inserting spaces within words, like this:

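// Extension method: it must be declared inside a non-nested static class.
// Requires: using System; using System.Text;
private static string InsertSpacesBetweenWords(this string subject, int where)
{
    string[] words = subject.Split(new[] { " " }, StringSplitOptions.RemoveEmptyEntries);
    var output = new StringBuilder();
    bool done = false;

    for (int i = 0; i < words.Length; i++)
    {
        // Pad only once, before the first word (other than the very first)
        // that would push the text past position `where`.
        if (i > 0 && !done && words[i].Length + output.Length > where)
        {
            // Fill with spaces up to position where + 6.
            while (output.Length < where + 6)
                output.Append(' ');
            done = true;
        }
        // Every word is followed by one space, including the last one.
        output.Append(words[i]).Append(' ');
    }
    return output.ToString();
}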

Then I converted the mail subject using this function:

mail.Subject = mySubject.InsertSpacesBetweenWords(38).InsertSpacesBetweenWords(18);

The interesting point is that Gmail and Yahoo Mail (and possibly other web-based mail systems) ignore the extra spaces and show the subject as expected.

Appending two whitespace characters (U+2000, EN QUAD) to the subject worked for me. Don't ask why.

var mail = new MailMessage(from, to);
mail.Subject = subject + new string(new char[] { '\u2000', '\u2000' });
mail.SubjectEncoding = Encoding.UTF8;


Licensed under: CC-BY-SA with attribution