Question

Python has a terrific urlencode() function which encodes a dict via RFC 1738 (Plus encoding):

>>> import urllib.parse
>>> urllib.parse.urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'site=Stack+Overflow&Coder=Jeff+Atwood'

I cannot find a replacement that uses RFC 3986 (Percent encoding), even though the fine manual states the following:

RFC 3986 - Uniform Resource Identifiers
This is the current standard (STD66). Any changes to urllib.parse module should conform to this.

This would be the expected output:

>>> urllib.parse.urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'site=Stack%20Overflow&Coder=Jeff%20Atwood'

Of course I could roll my own, but I find it surprising that I can find no such Python function built in. Is there such a Python function that I'm just not finding?


Solution

When this was asked there was no such thing built in, but there was a bug requesting one, with a patch attached: http://bugs.python.org/issue13866. Since Python 3.5, urlencode() accepts a quote_via argument that covers this case.
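As a hedged sketch for Python 3.5 and later: passing urllib.parse.quote as quote_via makes urlencode() emit %20 where the default emits +. The params dict is just the sample data from the question, and the variable names are illustrative.

```python
from urllib.parse import quote, urlencode

params = {'site': 'Stack Overflow', 'Coder': 'Jeff Atwood'}

# Default behaviour: quote_plus is used, so spaces become '+'
plus_encoded = urlencode(params)

# quote_via=quote switches to percent-encoding, so spaces become '%20'
percent_encoded = urlencode(params, quote_via=quote)

print(plus_encoded)     # site=Stack+Overflow&Coder=Jeff+Atwood
print(percent_encoded)  # site=Stack%20Overflow&Coder=Jeff%20Atwood
```

The pair order shown assumes Python 3.7 or later, where dicts keep insertion order.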

OTHER TIPS

For strings you can use this:

def percent_encoding(string):
    # RFC 3986 unreserved characters, as a set of byte values for fast lookup
    accepted = set(b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~')
    result = ''
    for byte in string.encode('utf-8'):
        # {:02X} zero-pads, so byte 0x0A becomes '%0A' rather than '%A'
        result += chr(byte) if byte in accepted else '%{:02X}'.format(byte)
    return result

>>> percent_encoding('http://www.google.com')
'http%3A%2F%2Fwww.google.com'

>>> percent_encoding('ñapa')
'%C3%B1apa'
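For comparison, the standard library's urllib.parse.quote should give the same results for these inputs when called with safe=''. This is a small sketch, and percent_encode_stdlib is a hypothetical wrapper name, not something from the answer above.

```python
from urllib.parse import quote


def percent_encode_stdlib(text):
    # safe='' leaves no reserved character unescaped (the default would keep '/')
    return quote(text, safe='')


print(percent_encode_stdlib('http://www.google.com'))  # http%3A%2F%2Fwww.google.com
print(percent_encode_stdlib('ñapa'))                   # %C3%B1apa
```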

For a dictionary, only the values are encoded here, so all that is left is a function that turns the dictionary into URL key/value pairs and percent-encodes each value. The keys are assumed to be URL-safe already.

def percent_urlencode(dictionary):
    # Keys are emitted as-is; only the values are percent-encoded
    return '&'.join("{}={}".format(k, percent_encoding(str(v))) for k, v in dictionary.items())

>>> percent_urlencode({'token': '$%&/', 'username': 'me'})
'token=%24%25%26%2F&username=me'

>>> percent_urlencode({'site':'Stack Overflow','Coder':'Jeff Atwood'})
'site=Stack%20Overflow&Coder=Jeff%20Atwood'
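A quick way to sanity-check the result is to decode it again. The sketch below uses urllib.parse.parse_qs and also shows that the percent-encoded and plus-encoded forms of the question's sample data decode to the same dictionary; the variable names are illustrative.

```python
from urllib.parse import parse_qs

percent_form = 'site=Stack%20Overflow&Coder=Jeff%20Atwood'
plus_form = 'site=Stack+Overflow&Coder=Jeff+Atwood'

# parse_qs returns a dict mapping each key to a list of decoded values
decoded_percent = parse_qs(percent_form)
decoded_plus = parse_qs(plus_form)

print(decoded_percent)                  # {'site': ['Stack Overflow'], 'Coder': ['Jeff Atwood']}
print(decoded_percent == decoded_plus)  # True
```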
Licensed under: CC-BY-SA with attribution