Question

I would like to encode user input values coming from a web interface in such a way that I can pass the data into my system in a safe way. In other words, I want to strip out what I consider bad characters such as quotes and brackets, etc.

I could use base64 and it would work fine. However, I'd like to be able to read the strings if they were originally alphanumeric in a human readable format at the lower layers.

So, 'Nice string' would be encoded to 'Nice string' but 'N@sty!!``string!!))' would be encoded to something like "N=E2sty=B3=B3=XY=XYstring=B3=B3=B1=B1". I just made that up but you get the idea.

Does such an encoding format exist and particularly does it exist in python.

Was it helpful?

Solution

How about:

urllib.quote("'N@sty!!`` string,!!))'",safe=" ,.").replace('%','=')
'=27N=40sty=21=21=60=60 string,=21=21=29=29=27'

OTHER TIPS

You could use urllib.quote:

>>> urllib.quote("Nice String");
'Nice%20String'
>>> urllib.quote("N@sty!!``string!!))");
'N%40sty%21%21%60%60string%21%21%29%29'

I think Punycode would meet your needs.

import encodings.punycode
encoded = encodings.punycode.punycode_encode(inputstr)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top