Unescape Python strings from HTTP

https://stackoverflow.com/questions/780334

13-09-2019

Question

I have a string taken from an HTTP header, but it has been escaped. Which function can I use to unescape it?

myemail%40gmail.com -> myemail@gmail.com

Would urllib.unquote() be the way to go?

Solution

I am pretty sure that urllib's unquote is the common way of doing this.

>>> import urllib
>>> urllib.unquote("myemail%40gmail.com")
'myemail@gmail.com'

There is also unquote_plus:

Like unquote(), but also replaces plus signs with spaces, as required for unquoting HTML form values.

Other tips

Yes, it appears that urllib.unquote() accomplishes that task. (I tested it against your example on codepad.)

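In Python 3, these functions are urllib.parse.unquote and urllib.parse.unquote_plus.

The latter is used, for example, for query strings in HTTP URLs, where the space character ( ) is traditionally encoded as a plus sign (+), and a literal + is percent-encoded as %2B.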
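As a minimal sketch of the difference between the two in Python 3 (the sample string is made up for illustration and does not come from the question):

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical form-style value: '+' stands for a space,
# '%2B' for a literal plus sign, '%40' for '@'.
raw = "a%2Bb+c%40example.com"

plain = unquote(raw)       # decodes percent-escapes, leaves '+' untouched
form = unquote_plus(raw)   # turns '+' into a space, then decodes percent-escapes

print(plain)
print(form)
```

For a header value such as the e-mail address in the question, which contains no plus signs, both functions give the same result.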

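In addition to these, there is unquote_to_bytes, which converts the given encoded string to bytes. It can be used when the encoding is not known or when the encoded data is binary. However, there is no unquote_plus_to_bytes; if you need it, you can write: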

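from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Replace '+' with a space first, using a literal that matches the input type.
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    # Then decode the percent-escapes into raw bytes.
    return unquote_to_bytes(s)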
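A possible usage sketch for the case where the charset is not UTF-8. The input string and the choice of Latin-1 are assumptions for illustration, and the helper is restated in compact form so the snippet runs on its own:

```python
from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Same behaviour as the helper above, restated for a self-contained example.
    plus, space = (b'+', b' ') if isinstance(s, bytes) else ('+', ' ')
    return unquote_to_bytes(s.replace(plus, space))

# Hypothetical input: "café au lait" percent-encoded as Latin-1, not UTF-8.
raw = "caf%E9+au+lait"

data = unquote_plus_to_bytes(raw)   # raw bytes, no charset assumed
text = data.decode("latin-1")       # decode only once the charset is known

print(data)
print(text)
```

Working on bytes first avoids committing to a charset too early: the decode step can be deferred until the encoding has been determined.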

More information on whether to use unquote or unquote_plus is available at "URL encoding the space character: + or %20".

Licensed under: CC-BY-SA with attribution
