Question

I searched but couldn't find what I needed, although I think it should be simple (if you have any Python experience, which I don't).

Given a string, I want to verify, in Python, that it contains ONLY alphanumeric characters: a-zA-Z0-9 and . _ -

Examples:

Accepted:

bill-gates

Steve_Jobs

Micro.soft

Rejected:

Bill gates -- no spaces allowed

me@host.com -- @ is not alphanumeric

I'm trying to use:

if re.match("^[a-zA-Z0-9_.-]+$", username) == True:

But that doesn't seem to do the job...


Solution

re.match does not return a boolean; it returns a MatchObject on a match, or None on a non-match.

>>> re.match("^[a-zA-Z0-9_.-]+$", "hello")
<_sre.SRE_Match object at 0xb7600250>
>>> re.match("^[a-zA-Z0-9_.-]+$", "    ")
>>> print(re.match("^[a-zA-Z0-9_.-]+$", "    "))
None

So, you shouldn't do re.match(...) == True; rather, you should be checking re.match(...) is not None in this case, which can be further shortened to just if re.match(...).
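To make the difference concrete, here is a minimal sketch that wraps the check in a function. The names is_valid_username and USERNAME_PATTERN are made up for illustration; the pattern is the one from the question.

```python
import re

# Hypothetical names; the pattern itself is taken from the question.
USERNAME_PATTERN = r"^[a-zA-Z0-9_.-]+$"


def is_valid_username(username):
    """Return True if username contains only letters, digits, '.', '_' or '-'."""
    # re.match returns a match object or None, so compare against None
    # (or rely on truthiness) instead of comparing with True.
    return re.match(USERNAME_PATTERN, username) is not None


for name in ["bill-gates", "Steve_Jobs", "Micro.soft", "Bill gates", "me@host.com"]:
    print(name, is_valid_username(name))
```

The first three names from the question are accepted and the last two are rejected. Comparing the match object itself with == True always gives False, which is why the original check never passed.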

OTHER TIPS

Avoid == True or == False in a condition. Most objects already have a truth value (None is false, a match object is true), so you can test the result directly:

if re.match("^[a-zA-Z0-9_.-]+$", username):

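You could also shorten it slightly with \w, which covers letters, digits and the underscore (in Python 3 it also matches non-ASCII word characters unless re.ASCII is passed):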

if re.match(r'^[\w.-]+$', username):
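One caveat with the shorter form: in Python 3, \w in a str pattern is Unicode-aware, so it is not strictly the same as a-zA-Z0-9_. The sketch below shows the difference, assuming ASCII-only usernames are wanted; the variable names are illustrative.

```python
import re

# In Python 3, \w in a str pattern matches Unicode word characters by default.
unicode_pattern = re.compile(r'^[\w.-]+$')
# re.ASCII restricts \w to [a-zA-Z0-9_], like the explicit character class.
ascii_pattern = re.compile(r'^[\w.-]+$', re.ASCII)

name = "jos\u00e9"  # "josé", which contains a non-ASCII letter

print(bool(unicode_pattern.match(name)))  # accepted by the default \w
print(bool(ascii_pattern.match(name)))    # rejected once re.ASCII is set
```

If non-ASCII letters should be rejected, either pass re.ASCII or keep the explicit a-zA-Z0-9_ class.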

I would consider this for a valid username:
1) Username must be 6-30 characters long
2) Username may only contain:

  • Uppercase and lowercase letters
  • Numbers from 0-9 and
  • Special characters _ - .

3) Username may not:

  • Begin or finish with characters _ - .

  • Contain two or more consecutive _ - . characters

This would be an example of usage:
if re.match(r'^(?![-._])(?!.*[_.-]{2})[\w.-]{6,30}(?<![-._])$', username) is not None:
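As a quick check of the rules listed above, the following sketch runs that pattern against a few sample names. STRICT_USERNAME and the sample list are made up for illustration.

```python
import re

# Same pattern as above, compiled once. The name STRICT_USERNAME is hypothetical.
STRICT_USERNAME = re.compile(
    r'^(?![-._])'        # must not begin with - . or _
    r'(?!.*[_.-]{2})'    # must not contain two special characters in a row
    r'[\w.-]{6,30}'      # 6 to 30 allowed characters
    r'(?<![-._])$'       # must not end with - . or _
)

samples = [
    "bill-gates",   # valid
    "Steve_Jobs",   # valid
    "bill",         # too short
    "_billgates",   # begins with a special character
    "billgates-",   # ends with a special character
    "bill..gates",  # two special characters in a row
    "Bill gates",   # contains a space
]

for s in samples:
    print(s, STRICT_USERNAME.match(s) is not None)
```

Only the first two samples pass; each of the others breaks exactly one of the rules.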

I do my validation this way in my utils class:

def valid_re(self, s, r):
    reg = re.compile(r)
    return reg.match(s)

Then I call the utils instance, and check this way:

if not utils.valid_re(username, r'^[a-zA-Z0-9_.-]+$'):
    error = "Invalid username!"
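One thing to be aware of with match and $: without re.MULTILINE, $ also matches just before a trailing newline, so "bill\n" would pass the check. A possible variant of the helper, written here as a plain function rather than a method and assuming Python 3.4 or later, uses re.fullmatch instead:

```python
import re


def valid_re(s, r):
    """Return True only if the whole string s matches the pattern r."""
    # re.fullmatch (Python 3.4+) requires the entire string to match,
    # so no ^ or $ anchors are needed and a trailing newline is rejected.
    return re.fullmatch(r, s) is not None


pattern = r'[a-zA-Z0-9_.-]+'

print(valid_re("bill", pattern))    # plain name
print(valid_re("bill\n", pattern))  # same name with a trailing newline
```

With re.match and the anchored pattern both inputs are accepted; with re.fullmatch only the first one is.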

If you are going to use many regular expressions, you can compile them for speed (or readability):

import re

ALPHANUM = re.compile(r'^[a-zA-Z0-9_.-]+$')

users = ["bill-gates", "Bill gates"]  # example data

for u in users:
    if ALPHANUM.match(u) is None:
        print("invalid")

From the docs:

The compiled versions of the most recent patterns passed to re.match(), re.search() or re.compile() are cached, so programs that use only a few regular expressions at a time needn’t worry about compiling regular expressions.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow