Best practices for email address validation (including the + in gmail addresses)

https://stackoverflow.com/questions/1874104

18-09-2019

Question

I know there are a lot of questions on here about email validation and specific regexes. I'd like to know what the best practices are for validating email addresses, given the username+anythingelse@gmail.com trick (details here). My current regex for JavaScript validation is as follows, but it doesn't support the extra + in the handle:

/^([a-zA-Z0-9_.-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/

Are there any other services that support the extra +? Should I allow a + in the address or should I alter the RegEx to only allow it for an email with gmail.com or googlemail.com as the domain? If so, what would be the altered RegEx?

UPDATE: Thanks to everyone for pointing out that + is valid per the spec. I didn't know that, and now I do for the future. For those of you saying that it's bad to even use a regex to validate it: my reason is based entirely on a creative design I'm building to. Our client's design places a green check or a red X next to the email address input when it loses focus (on blur). That icon indicates whether or not it's a valid email address, so I must use some JS to validate it.

Solution

+ is a valid character in an email address. It doesn't matter whether the domain is gmail.com, googlemail.com, or anything else.

Regexes aren't actually a very good way of validating emails, but if you just want to modify your regex to handle the plus, change it to the following:

/^([a-zA-Z0-9_.+-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/

As an example of how this regex doesn't validate against the spec: the address ..@-.com passes it, even though the spec does not allow it.
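To see both points in practice, here is a minimal sketch that runs a plus-enabled variant of the pattern against a few addresses. It assumes the hyphen is kept last in the character class (so it is read as a literal rather than a range) and that the domain dot is escaped; the constant name PLUS_AWARE and the sample addresses are made up for illustration.

```javascript
// Variant of the pattern above: "+" added to the local part, hyphen kept last
// in the character class so it is a literal, and the domain dot escaped.
const PLUS_AWARE = /^([a-zA-Z0-9_.+-])+@(([a-zA-Z0-9-])+\.)+([a-zA-Z0-9]{2,4})+$/;

// Illustrative inputs only.
const samples = [
  "username+anythingelse@gmail.com", // plus addressing on Gmail
  "user+tag@example.org",            // "+" is not specific to Gmail
  "..@-.com",                        // nonsense, yet the pattern accepts it
  "no-at-sign.example.com",          // rejected: there is no "@"
];

// Pair each address with the result of the test.
const results = samples.map((address) => [address, PLUS_AWARE.test(address)]);
console.log(results);
```

The first three addresses pass and the last one fails, which shows that the pattern is both too strict in general (it knows nothing about other legal characters) and too loose (it lets ..@-.com through).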

OTHER TIPS

If you need to validate emails via regexp, then read the standard or at least this article.

The standard defines the address grammar rather than a regexp; a regexp commonly quoted as implementing that grammar is:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

If that doesn't scare you, it should :)

I would tend to go with something along the lines of /.+@.+\..+/ to check for simple mistakes. Then I would send an email to the address to verify that it actually exists, since most typos will still result in syntactically valid email addresses.
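A rough sketch of that first step, assuming nothing beyond the pattern quoted above. The helper name looksLikeEmail is hypothetical, and sending the verification email is left out because it depends entirely on your backend.

```javascript
// Loose sanity check from the text: something, "@", something, ".", something.
const LOOSE = /.+@.+\..+/;

// Hypothetical helper: true means "worth sending a verification email to",
// not "this address exists".
function looksLikeEmail(input) {
  return LOOSE.test(input.trim());
}

console.log(looksLikeEmail("user@example.com")); // true
console.log(looksLikeEmail("user@gmial.com"));   // true: a typo, but still well-formed
console.log(looksLikeEmail("userexample.com"));  // false: no "@"
console.log(looksLikeEmail("user@example"));     // false: no dot after the "@"
```

The second call is the reason for the verification email: a mistyped domain is syntactically fine, so only a delivered message can catch it.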

The specs allow for some really crazy ugly email addresses. I'm often very annoyed by websites even complaining about perfectly normal, valid email addresses, so please, try not to reject valid email addresses. It's better to accept some illegal addresses than to reject legal ones.

Like others have suggested, I'd go with using a simple regexp like /.+@.+/ and then sending a verification email. If it's important enough to validate, it's important enough to verify, because a legal email address can still belong to someone other than your visitor, or contain an unintended but fatal typo.

Edit: removed the dot from the domain part of the regex, because a@to is still a valid email address. So even my super simplified validation rejected valid addresses. Is there any downside at all to just accepting everything that contains an @ with something in front and behind it?
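For comparison, a small sketch of the two simplified patterns side by side, wired to the kind of check/X indicator the question describes. The names WITH_DOT, MINIMAL and indicatorFor, and the "check"/"x" return values, are assumptions for illustration and not part of anyone's actual code.

```javascript
const WITH_DOT = /.+@.+\..+/; // stricter version: requires a dot in the domain
const MINIMAL = /.+@.+/;      // relaxed version: anything, "@", anything

// Hypothetical helper mapping an input to the indicator state shown on blur.
function indicatorFor(input) {
  return MINIMAL.test(input) ? "check" : "x";
}

console.log(WITH_DOT.test("a@to")); // false: no dot in the domain
console.log(MINIMAL.test("a@to"));  // true
console.log(indicatorFor("a@to"));  // "check"
console.log(indicatorFor("@to"));   // "x": nothing before the "@"
console.log(indicatorFor("a@"));    // "x": nothing after the "@"
```

In a page you would call something like indicatorFor from the input's blur handler and still rely on the verification email for the real answer.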

A very good article about this subject: "I Knew How To Validate An Email Address Until I Read The RFC".

Licensed under: CC-BY-SA with attribution
