Question

Using Python, I need to send non-UTF-8 encoded data (specifically Shift-JIS) to a URL via the query string. How should I transfer the data? Quote it? Encode it in UTF-8?

Thanks


Solution

Query string parameters are byte-based. Whilst IRI-to-URI conversion and non-ASCII characters typed into a browser's address bar will typically use UTF-8, there is nothing forcing you to send or receive your own parameters in that encoding.

So for Shift-JIS (in practice usually cp932, the Windows extension of that encoding), in Python 2:

import urllib
foo = u'\u65E5\u672C\u8A9E'  # 日本語
# Encode the text to cp932 bytes first, then percent-escape those bytes
url = 'http://www.example.jp/something?foo=' + urllib.quote(foo.encode('cp932'))

In Python 3, you pass the encoding to the quote function itself:

import urllib.parse
foo = '\u65E5\u672C\u8A9E'  # 日本語
# quote() encodes the str to cp932 bytes before percent-escaping them
url = 'http://www.example.jp/something?foo=' + urllib.parse.quote(foo, encoding='cp932')
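
For completeness, here is a rough Python 3 sketch of the full round trip, on the assumption that the receiving side is also Python and knows the parameter was sent as cp932. The parameter name foo and the example.jp URL are just the placeholders from above; urlencode and parse_qs both accept an encoding argument.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

foo = '\u65E5\u672C\u8A9E'  # 日本語

# Sending side: percent-escape the cp932 bytes of each parameter value.
query = urlencode({'foo': foo}, encoding='cp932')
url = 'http://www.example.jp/something?' + query

# Receiving side: decode the percent-escapes using the same encoding.
# (Assumption: both ends have agreed on cp932 out of band.)
params = parse_qs(urlsplit(url).query, encoding='cp932')
received = params['foo'][0]
```

If the receiver decodes with the default UTF-8 instead, the value will come out garbled, so the encoding has to be agreed on by both ends.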

OTHER TIPS

I don't know what Unicode has to do with this, since the query string is a string of bytes. You can use the quoting functions in urllib to quote plain byte strings so that they can be passed within query strings.
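
To illustrate the byte-level view, a small Python 3 sketch (an illustration only, assuming the data is already available as cp932 bytes) could use quote_from_bytes and unquote_to_bytes, which never interpret the data as text:

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Raw bytes as they might arrive from a file or database (here: cp932 text).
raw = '\u65E5\u672C\u8A9E'.encode('cp932')

# Percent-escape every byte that is not an unreserved URL character.
escaped = quote_from_bytes(raw, safe='')

# The receiver gets the identical bytes back, whatever encoding they were in.
restored = unquote_to_bytes(escaped)
```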

By »query string«, do you mean an HTTP GET parameter, as in http://{URL}?data=XYZ?

One option is to encode whatever data you have with base64.b64encode, using -_ as the alternative characters so that the result is URL-safe.
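
A minimal Python 3 sketch of that Base64 option follows. The parameter name data and the example.jp URL are placeholders, and the receiver is assumed to know that the value is Base64 of cp932 bytes.

```python
import base64

payload = '\u65E5\u672C\u8A9E'.encode('cp932')  # 日本語 as cp932 bytes

# altchars=b'-_' swaps '+' and '/' for '-' and '_', which are safe in URLs.
token = base64.b64encode(payload, altchars=b'-_').decode('ascii')
url = 'http://www.example.jp/something?data=' + token

# Receiving side: reverse the Base64 step, then decode the bytes as cp932.
decoded = base64.b64decode(token, altchars=b'-_')
text = decoded.decode('cp932')
```

base64.urlsafe_b64encode produces the same output. For payload lengths that are not a multiple of three, the token ends in = padding, which you may want to percent-escape or strip, depending on how the receiver parses it.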

Licensed under: CC-BY-SA with attribution