Question

I have a text file of e-mails like this:

10:info@example.com;dev@example.com
12:john@host.com; "George <g.top@host.com>" 
43:jim.p@web.com.;sue-allen@web.com
...

I want to check whether the list contains only well-formatted entries. Do you know of any tool or web service that can check it and give me a list of the invalid addresses?

Update: Dear all, thank you for your input. I was really looking for a basic syntax check, so I will stay with Rafe's idea (I will do it with Java).


Solution

I wrote a simple Perl script that uses the Email::Address module to validate these addresses:

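#!/usr/bin/env perl

use strict;
use warnings;

use Email::Address;

# Read every line from the files named on the command line (or from STDIN).
while (<>) {
    chomp;
    # Entries on a line are separated by semicolons.
    my @addresses = split /;/;

    foreach my $address (@addresses) {
        # parse() returns the addresses it finds in the string;
        # if it finds none, report the entry as invalid.
        if (!Email::Address->parse($address)) {
            print $address, "\n";
        }
    }
}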

You'll just need to install the module. Its home page is:

http://emailproject.perl.org/wiki/Email::Address

OTHER TIPS

Read this so that you do it the RFC-compliant way:

http://www.eph.co.uk/resources/email-address-length-faq/

Probably the simplest way to validate an email address is to send a message to it. As Sean points out, this can leave you open to DoS attacks, but from your description it seems you have a text file rather than a web page, so this shouldn't be a problem.

Regular expressions are not a good tool for matching email addresses: many valid addresses will fail a naive match. Check out this comparison of attempts to validate emails with regex for details.
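To make the point concrete, here is a small Python sketch. The pattern `NAIVE_EMAIL_RE` is a hypothetical example of the kind of regex people commonly write, not one taken from the comparison mentioned above. It rejects several addresses that are valid under RFC 5322, because `+` and `'` are legal in the local part and top-level domains can be longer than four letters.

```python
import re

# A typical "naive" pattern (hypothetical, for illustration only).
NAIVE_EMAIL_RE = re.compile(
    r"^[a-z0-9._-]+@[a-z0-9.-]+\.[a-z]{2,4}$", re.IGNORECASE
)


def naive_is_valid(address: str) -> bool:
    """Return True if the naive pattern accepts the address."""
    return NAIVE_EMAIL_RE.match(address) is not None


# Addresses that are syntactically valid but that the naive pattern rejects.
VALID_BUT_REJECTED = [
    "user+tag@example.com",   # '+' is allowed in the local part
    "o'brien@example.com",    # so is an apostrophe
    "info@example.museum",    # TLDs may have more than four letters
]

if __name__ == "__main__":
    for addr in VALID_BUT_REJECTED:
        print(addr, "accepted" if naive_is_valid(addr) else "rejected")
```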

If you have to check them offline, I would split each address into its parts (i.e. the part before the @ and the part after it). You could then create a custom validator (or regex) for each part.
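A minimal sketch of that split-and-validate idea in Python follows. All function names are made up for illustration. It assumes the file format from the question (an optional numeric prefix, then a colon, then semicolon-separated entries) and only accepts bare `local@domain` addresses. It is deliberately stricter than the RFC: quoted local parts, display names, and domain literals are all reported as invalid.

```python
import re

# RFC 5322 "atext": characters allowed in an unquoted local part.
_ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# Dot-separated runs of atext, with no leading, trailing, or doubled dots.
_LOCAL_RE = re.compile(rf"{_ATEXT}+(?:\.{_ATEXT}+)*")
# One DNS label: letters, digits, hyphens; no hyphen at either end.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_local(local: str) -> bool:
    """Check the part before the @ (at most 64 characters)."""
    return len(local) <= 64 and _LOCAL_RE.fullmatch(local) is not None


def is_valid_domain(domain: str) -> bool:
    """Check the part after the @: at least two well-formed labels."""
    if len(domain) > 253:
        return False
    labels = domain.split(".")
    return len(labels) >= 2 and all(_LABEL_RE.fullmatch(label) for label in labels)


def is_valid_address(address: str) -> bool:
    """Split on the last @ and validate each part separately."""
    local, sep, domain = address.strip().rpartition("@")
    return bool(sep) and is_valid_local(local) and is_valid_domain(domain)


def invalid_entries(line: str) -> list[str]:
    """Return the entries on one line of the file that fail the check."""
    prefix, sep, rest = line.partition(":")
    if not (sep and prefix.strip().isdigit()):
        rest = line  # no numeric "NN:" prefix, so treat the whole line as entries
    entries = (entry.strip() for entry in rest.split(";"))
    return [entry for entry in entries if entry and not is_valid_address(entry)]
```

Run against the sample lines from the question, this flags `"George <g.top@host.com>"` and `jim.p@web.com.` and accepts the rest.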

Email validation is not as simple as a regular expression

First, I would read the article "I Knew How To Validate An Email Address Until I Read The RFC".

Back in the days of yore, you could just connect to the user's mail server and use the VRFY command to verify that an email address was valid, but spammers abused that privilege and we all lost out.

Now, I would recommend a three part approach:

  1. Verify the syntactic validity. You can use the monster regex from the Mail Perl module to check that the email address is well formed. Then make sure to blacklist localhost domains/IPs as part of your check.

  2. Verify that the domain is live. Do a DNS validation check on the domain. You could take this one step further and use an SMTP check to make sure that you can connect to a valid mail server for the domain. However, there may be some false-negative results due to virtual hosting schemes.

  3. Send an actual email, but include a single image that links to a script on your server. When the email is read with the image loaded, your server will be notified that the image was downloaded and hence that the address is alive and valid. However, nowadays many email clients do not load images by default for this very reason, so it won't be 100% effective.
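The localhost blacklist from step 1 and the DNS check from step 2 could be sketched in Python roughly as below. The function names are hypothetical. The DNS part uses only `socket.getaddrinfo` from the standard library, so it checks that the domain resolves to some address rather than looking up MX records; a real MX lookup would need a DNS library, which this sketch does not assume. The `resolver` parameter exists only so the lookup can be swapped out when no network is available.

```python
import ipaddress
import socket


def is_local_domain(domain: str) -> bool:
    """Return True for localhost names and loopback/private IP literals."""
    host = domain.strip().strip("[]").lower().rstrip(".")
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal, so treat it as an ordinary name
    return ip.is_loopback or ip.is_private or ip.is_unspecified


def domain_resolves(domain: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the resolver finds at least one address for the domain."""
    try:
        return bool(resolver(domain, 25))
    except (socket.gaierror, UnicodeError):
        return False


def passes_domain_checks(address: str, resolver=socket.getaddrinfo) -> bool:
    """Reject local domains (step 1), then require the domain to resolve (step 2)."""
    _local, sep, domain = address.rpartition("@")
    if not sep or not domain:
        return False
    return not is_local_domain(domain) and domain_resolves(domain, resolver)
```

A domain that resolves can still refuse mail for a given mailbox, so a positive result here only means the address is worth passing on to step 3.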

Resources

  1. Validating Email Addresses in ASP (online)
  2. Validating Email Addresses in PHP (code examples)
  3. This commercial product does bulk email verification ← This is probably what you are looking for
  4. SO Question: How to check if an email address exists without sending an email

This problem is harder than it appears. When faced with it, I stole the code from the mf.c module in the NMH sources. I then imported the address parser into Lua so I could handle email addresses from scripts.

Using somebody else's code saved me a world of pain.

Licensed under: CC-BY-SA with attribution