Question

Assuming I have an SMTP/IMAP/POP3 login URL like this:

smtp://foobar@example.com:abc@smtp.example.com:465

I want to replace the password (abc in this case) with a constant number of stars (e.g. *****) in order to hide it from users.

What I tried so far heavily uses lookarounds:

import re

def starPassword(route):
    """
    >>> starPassword("smtp://foobar@example.com:abc@smtp.example.com:465")
    'smtp://foobar@example.com:*****@smtp.example.com:465'
    >>> starPassword("smtp://foobar:abc@smtp.example.com:25")
    'smtp://foobar:*****@smtp.example.com:25'
    """
    # Regex explanation:
    #  (?<=:) asserts that a colon precedes the password, without consuming it
    #  ([^@]+) matches the password (TODO use a better match, passwords might contain @! Check escaping)
    #  (?=@[^@]+$) asserts that the @ before the server, plus the rest of the URL, follows
    return re.sub("(?<=:)([^@]+)(?=@[^@]+$)", "*****", route)

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Unfortunately, this regex has several problems, including:

  • The first unit test succeeds, but the second doesn't, because the colon of the protocol (smtp://) is matched. I tried (?<=\w+://\w+:), but lookbehinds need to be fixed-length. Maybe I can consume those URL parts and replace them with something like \1*****\2 or similar?
  • Passwords containing @ and/or : won't be recognized. I'm not even sure how they are escaped (this is why I don't use the non-greedy flag).

Note that I can't use Python 3 (urlparse module); I also don't want to use third-party libraries unless strictly necessary.

Thanks in advance for pointing me in the right direction.

Solution

You can use the urlparse.urlsplit() function (which is also available in Python 2); the .netloc attribute contains the username and password (both of which would be escaped so that they contain no plain : or @ characters, see RFC 3986 Section 3.2.1):

import urlparse

def starPassword(route):
    parsed = urlparse.urlsplit(route)
    if '@' not in parsed.netloc:
        return route

    userinfo, _, location = parsed.netloc.partition('@')
    username, _, password = userinfo.partition(':')
    if not password:
        return route

    userinfo = ':'.join([username, '*****'])
    netloc = '@'.join([userinfo, location])
    parsed = parsed._replace(netloc=netloc)
    return urlparse.urlunsplit(parsed)
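
To illustrate the escaping mentioned above: a minimal sketch, assuming the login URL is assembled by your own code and that the credentials are percent-encoded with quote(). The helper name build_login_url and its parameters are hypothetical, chosen only for this example.

```python
try:
    from urllib import quote        # Python 2
except ImportError:
    from urllib.parse import quote  # Python 3


def build_login_url(scheme, username, password, host, port):
    """Hypothetical helper: assemble a login URL with escaped credentials."""
    # safe="" makes quote() escape every reserved character,
    # including '@', ':' and '/', which it would otherwise partly keep.
    userinfo = "%s:%s" % (quote(username, safe=""), quote(password, safe=""))
    return "%s://%s@%s:%d" % (scheme, userinfo, host, port)


print(build_login_url("smtp", "foobar@example.com", "a:b@c",
                      "smtp.example.com", 465))
# smtp://foobar%40example.com:a%3Ab%40c@smtp.example.com:465
```

With the credentials encoded this way, the only plain @ in the network location is the one separating the credentials from the host, and the first plain : separates username from password.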

Demo:

>>> starPassword('smtp://foobar%40example.com:abc@smtp.example.com:465')
'smtp://foobar%40example.com:*****@smtp.example.com:465'
>>> starPassword('smtp://foobar:abc@smtp.example.com:25')
'smtp://foobar:*****@smtp.example.com:25'
>>> starPassword('smtp://smtp.example.com:1234')
'smtp://smtp.example.com:1234'
>>> starPassword('smtp://foo@smtp.example.com:42')
'smtp://foo@smtp.example.com:42'
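
Note that the demo passes the username as foobar%40example.com. With the unescaped URL from the question, the function above splits at the first @ and, as far as I can tell, returns the URL unchanged. If unescaped input has to be tolerated, one possible variation is to split at the last @ instead. This is only a sketch under that assumption; the name star_password_lenient and the mask parameter are made up for this example, and a username containing a plain : still cannot be told apart from the password.

```python
try:
    from urlparse import urlsplit, urlunsplit      # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit  # Python 3


def star_password_lenient(route, mask="*****"):
    """Hypothetical variant that tolerates an unescaped '@' in the credentials."""
    parsed = urlsplit(route)
    # rpartition() splits at the LAST '@', so any earlier '@' stays
    # inside the credentials instead of being mistaken for the host.
    userinfo, at, location = parsed.netloc.rpartition("@")
    if not at:
        return route  # no credentials at all
    # partition() splits at the FIRST ':', so the password may contain ':'.
    username, colon, password = userinfo.partition(":")
    if not colon or not password:
        return route  # username only, nothing to hide
    netloc = "%s:%s@%s" % (username, mask, location)
    return urlunsplit(parsed._replace(netloc=netloc))


print(star_password_lenient("smtp://foobar@example.com:abc@smtp.example.com:465"))
# smtp://foobar@example.com:*****@smtp.example.com:465
```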

OTHER TIPS

Use this regular expression:

(?<=:)([^@:]+)(?=@[^@]+$)

I added : to the character class [^@]. As a result, the regular expression matches only the text between a : and the last @, and only if that text contains no : or @ itself.

import re

print(re.sub("(?<=:)([^@:]+)(?=@[^@]+$)", "*****",
             "smtp://foobar:abc@smtp.example.com:25"))

smtp://foobar:*****@smtp.example.com:25
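
The idea raised in the question, consuming the surrounding URL parts and putting them back with \1 and \2, avoids the lookbehind altogether. A rough sketch, assuming the username contains no plain : and the host part contains no @; the names LOGIN_RE and mask_password are hypothetical.

```python
import re

# Group 1: scheme, '://', username (no ':') and the separating colon.
# The greedy '.+' in the middle is the password; it may contain ':' or '@'.
# Group 2: the last '@' plus host and port, up to the end of the string.
LOGIN_RE = re.compile(r"^(\w+://[^:]+:).+(@[^@]+)$")


def mask_password(route, mask="*****"):
    """Hypothetical helper: replace the password, keep everything else."""
    # \g<1> and \g<2> re-insert the consumed parts around the mask.
    return LOGIN_RE.sub(r"\g<1>" + mask + r"\g<2>", route)


print(mask_password("smtp://foobar:abc@smtp.example.com:25"))
# smtp://foobar:*****@smtp.example.com:25
print(mask_password("smtp://foobar@example.com:abc@smtp.example.com:465"))
# smtp://foobar@example.com:*****@smtp.example.com:465
```

URLs without a password, such as smtp://smtp.example.com:1234, do not match the pattern and are returned unchanged.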
Licensed under: CC-BY-SA with attribution