Question

The MailAddress class doesn't provide a way to parse a string with multiple emails. The MailAddressCollection class does, but it only accepts CSV and does not allow commas inside of quotes. I am looking for a text processor to create a collection of emails from user input without these restrictions.

The processor should take comma- or semicolon-separated values in any of these formats:

"First Middle Last" <fml@example.com>
First Middle Last <fml@example.com>
fml@example.com
"Last, First" <fml@example.com>

Solution 3

After asking a related question, I became aware of a better method:

/// <summary>
/// Extracts email addresses in the following formats:
/// "Tom W. Smith" &lt;tsmith@contoso.com&gt;
/// "Smith, Tom" &lt;tsmith@contoso.com&gt;
/// Tom W. Smith &lt;tsmith@contoso.com&gt;
/// tsmith@contoso.com
/// Multiple emails can be separated by a comma or semicolon.
/// Watch out for <see cref="FormatException"/>s when enumerating.
/// </summary>
/// <param name="value">Collection of emails in the accepted formats.</param>
/// <returns>
/// A collection of <see cref="System.Net.Mail.MailAddress"/>es.
/// </returns>
/// <exception cref="ArgumentException">Thrown if the value is null, empty, or just whitespace.</exception>
public static IEnumerable<MailAddress> ExtractEmailAddresses(this string value)
{
    if (string.IsNullOrWhiteSpace(value)) throw new ArgumentException("The arg cannot be null, empty, or just whitespace.", "value");

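    // Treat semicolons as commas so a single split handles both delimiters.
    // Note: this also rewrites semicolons that appear inside quoted display names.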
    value = value.Replace(';', ',');
    var emails = value.SplitWhilePreservingQuotedValues(',');
    var mailAddresses = emails.Select(email => new MailAddress(email));
    return mailAddresses;
}

/// <summary>
/// Splits the string while preserving quoted values (i.e. instances of the delimiter character inside of quotes will not be split apart).
/// Trims leading and trailing whitespace from the individual string values.
/// Does not include empty values.
/// </summary>
/// <param name="value">The string to be split.</param>
/// <param name="delimiter">The delimiter to use to split the string, e.g. ',' for CSV.</param>
/// <returns>A collection of individual strings parsed from the original value.</returns>
public static IEnumerable<string> SplitWhilePreservingQuotedValues(this string value, char delimiter)
{
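    // Emit the delimiter as a \uXXXX escape so characters such as ']' or '\' cannot break the character class.
    Regex csvPreservingQuotedStrings = new Regex(string.Format("(\"[^\"]*\"|[^\\u{0:X4}])+", (int)delimiter));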
    var values =
        csvPreservingQuotedStrings.Matches(value)
        .Cast<Match>()
        .Select(m => m.Value.Trim())
        .Where(v => !string.IsNullOrWhiteSpace(v));
    return values;
}
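
To make the regex easier to follow, here is a minimal standalone sketch of the same pattern with the delimiter fixed to a comma. The class and method names (SplitDemo, Split) are made up for illustration and are not part of the answer's code. The first alternative consumes a whole quoted run, commas included; the second consumes any single character that is not a comma, so a match ends only at an unquoted comma.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Same pattern as above, with the delimiter hard-coded to ','.
    // Alternative 1: a complete "quoted run" (may contain commas).
    // Alternative 2: any single character other than ','.
    private static readonly Regex Pattern = new Regex("(\"[^\"]*\"|[^,])+");

    // Hypothetical helper: returns the trimmed, non-empty pieces of value.
    public static string[] Split(string value)
    {
        return Pattern.Matches(value)
            .Cast<Match>()
            .Select(m => m.Value.Trim())
            .Where(v => v.Length > 0)
            .ToArray();
    }
}
```

Calling SplitDemo.Split on the input "Smith, Tom" <tsmith@contoso.com>, fml@example.com (with the name in quotes) should give two entries: the quoted name with its address, and the bare address.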

This method passes the following tests:

[TestMethod]
public void ExtractEmails_SingleEmail_Matches()
{
    string value = "a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_JustEmailCSV_Matches()
{
    string value = "a@a.a, a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_MultipleWordNameThenEmailSemicolonSV_Matches()
{
    string value = "a a a <a@a.a>; a a a <a@a.a>";
    var expected = new List<MailAddress>
        {
            new MailAddress("a a a <a@a.a>"),
            new MailAddress("a a a <a@a.a>"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_JustEmailsSemicolonSV_Matches()
{
    string value = "a@a.a; a@a.a";
    var expected = new List<MailAddress>
        {
            new MailAddress("a@a.a"),
            new MailAddress("a@a.a"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
public void ExtractEmails_NameInQuotesWithCommaThenEmailsCSV_Matches()
{
    string value = "\"a, a\" <a@a.a>, \"a, a\" <a@a.a>";
    var expected = new List<MailAddress>
        {
            new MailAddress("\"a, a\" <a@a.a>"),
            new MailAddress("\"a, a\" <a@a.a>"),
        };

    var actual = value.ExtractEmailAddresses();

    CollectionAssert.AreEqual(expected, actual.ToList());
}

[TestMethod]
[ExpectedException(typeof(ArgumentException))]
public void ExtractEmails_EmptyString_Throws()
{
    string value = string.Empty;

    var actual = value.ExtractEmailAddresses();
}

[TestMethod]
[ExpectedException(typeof(FormatException))]
public void ExtractEmails_NonEmailValue_ThrowsOnEnumeration()
{
    string value = "a";

    var actual = value.ExtractEmailAddresses();

    actual.ToList();
}
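
As the last test shows, the FormatException surfaces only when the result is enumerated, because the Select call is deferred. If one bad entry should not abort the whole batch, one option is to parse eagerly and collect the failures. This is only a sketch: SafeParseDemo and ParseAll are hypothetical names, and it assumes the input has already been split into non-empty entries (for example by SplitWhilePreservingQuotedValues).

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;

public static class SafeParseDemo
{
    // Hypothetical helper: parses every entry immediately.
    // Entries that MailAddress rejects are appended to 'rejected' instead of throwing.
    // Assumes entries are non-null and non-empty; those cases raise other exception types.
    public static List<MailAddress> ParseAll(IEnumerable<string> entries, List<string> rejected)
    {
        var parsed = new List<MailAddress>();
        foreach (var entry in entries)
        {
            try
            {
                parsed.Add(new MailAddress(entry));
            }
            catch (FormatException)
            {
                rejected.Add(entry);
            }
        }
        return parsed;
    }
}
```

The caller can then report the rejected strings back to the user rather than failing on the first invalid address.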

OTHER TIPS

The MailAddressCollection.Add() method accepts a comma-delimited address list.

Dim mc As New Net.Mail.MailAddressCollection()
mc.Add("Bob <bob@bobmail.com>, mary@marymail.com, ""John Doe"" <john.doe@myemail.com>")
For Each m As Net.Mail.MailAddress In mc
    Debug.Print("{0} ({1})", m.DisplayName, m.Address)
Next

Output:

Bob (bob@bobmail.com)
(mary@marymail.com)
John Doe (john.doe@myemail.com)

The older open-source library DotNetOpenMail has an EmailAddress class that can parse almost all legal forms of email addresses, as well as an EmailAddressCollection. You could start there.

Actually, MailAddressCollection DOES support comma-delimited addresses, even with commas inside the quotes. The real problem, which I discovered recently, is that the CSV list must already be encoded in the ASCII character set, i.e. Q-encoded or B-encoded for Unicode addresses.

There is no function in the base class libraries to perform this encoding, although I provide B-encoding in Sasa. I also just added an e-mail parsing function which addresses the question in this thread.
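
For reference, the B-encoding mentioned above is the RFC 2047 "encoded-word" form: the charset name, the letter B, and the Base64 of the encoded bytes, wrapped in =? and ?= markers. A minimal sketch, assuming UTF-8 and ignoring the 75-character limit that RFC 2047 places on a single encoded word; BEncodingDemo and BEncode are hypothetical names and not the Sasa API.

```csharp
using System;
using System.Text;

public static class BEncodingDemo
{
    // Hypothetical helper: wraps the UTF-8 bytes of 'text' as one RFC 2047 B-encoded word.
    // Does not split long input into multiple encoded words.
    public static string BEncode(string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        return "=?UTF-8?B?" + Convert.ToBase64String(bytes) + "?=";
    }
}
```

Only the display name would be encoded this way; the address inside the angle brackets is left as-is.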

Licensed under: CC-BY-SA with attribution