UTF-8 has nothing to do with this: you're validating a string, not a particular encoding thereof.
Your Regex actually returns true for
"äpfel@domain.com"
(with or without the CultureInvariant
option). Try
Console.Write(Regex.IsMatch("äpfel@domain.com", @"^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$", RegexOptions.CultureInvariant));
on its own, and you get true.
You will fail on all IDNs like
info@ουτοπία.δπθ.gr
and if you care about email addresses that aren't restricted to ASCII, you may want to include them. (And if you want to exclude prohibited confusables, things get really complicated.)
There are the problems others have stated with using regular expressions to validate emails, but they boil down to:
1. The actual email syntax is more complicated than people think (even before we deal with the non-ASCII extensions). E.g. did you know that
Abc\@def@example.com
is a valid email address? It is; in fact, it's an example of a valid address given in RFC 3696.
2. If you go to the effort of building a perfect validator (it is possible), it'll be a waste of effort. Chances are your email software won't handle them all (e.g.
Abc\@def@example.com
above won't work with a lot of software), and then lots of valid email addresses won't actually work.
But anyway, I get true
running your code, so the bug is elsewhere.
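
If you want to check this in isolation, a minimal console program along these lines should do. It is only a sketch: the class name Program and the constant name Pattern are my own choices, the pattern is copied unchanged from the question, and the ä is written as an escape so the encoding of the source file can't affect the result.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    // The pattern from the question, copied unchanged.
    const string Pattern =
        @"^([\w-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";

    static void Main()
    {
        // "\u00E4" is 'ä'; the escape keeps the test independent of
        // how the source file itself is encoded.
        string address = "\u00E4pfel@domain.com";

        // Default options.
        Console.WriteLine(Regex.IsMatch(address, Pattern));

        // Same check with CultureInvariant, as in the snippet above.
        Console.WriteLine(Regex.IsMatch(address, Pattern, RegexOptions.CultureInvariant));
    }
}
```

If both lines agree with what I see here, the regex is not what rejects the address, and the place to look is wherever the string comes from before it reaches the check.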