Parsing an HTTP response in Python

https://stackoverflow.com//questions/23049767

21-12-2019
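Question

I want to manipulate the information at this URL. I can open it and read its contents successfully. What I really want to do, though, is throw away everything I don't need and work with the parts I want to keep.

Is there a way to convert the string into a dict so that I can iterate over it? Or do I just have to parse it as it is (str type)?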

from urllib.request import urlopen

url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)

print(response.read()) # returns string with info

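Solution

When I printed response.read(), I noticed that a b was prepended to the string (e.g. b'{"a":1,..). The "b" stands for bytes and tells you the type of the object you are dealing with. Since I knew that a string can be converted to a dict with json.loads('string'), I only needed to convert the bytes type to a string type. I did that by decoding the response as UTF-8 with decode('utf-8'). Once it was a string, my problem was solved and I could easily iterate over the dict.

I don't know whether this is the fastest or most "Pythonic" way to write it, but it works, and there is always time to optimize and improve later. The full code of my solution: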

from urllib.request import urlopen
import json

# Get the dataset
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)

# Convert bytes to string type and string type to dict
string = response.read().decode('utf-8')
json_obj = json.loads(string)

print(json_obj['source_name']) # prints the string with 'source_name' key
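
The same bytes-to-dict steps can be tried without a network call. The following is only a minimal sketch: the payload is a made-up stand-in for what response.read() returns, the column_names and data fields are assumptions for illustration (only source_name appears in the answer above), and parse_json_bytes is a hypothetical helper name.

```python
import json

# Hypothetical payload standing in for the bytes returned by response.read().
raw = (
    b'{"source_name": "Example Source", '
    b'"column_names": ["Date", "Value"], '
    b'"data": [["2014-01-01", 17044.0], ["2013-10-01", 17078.3]]}'
)


def parse_json_bytes(payload, encoding="utf-8"):
    """Decode a bytes payload to str, then parse the str as JSON."""
    return json.loads(payload.decode(encoding))


parsed = parse_json_bytes(raw)

# The result is an ordinary dict, so it can be iterated like any other.
for key, value in parsed.items():
    print(key, "->", type(value).__name__)
```

Keeping the decode step explicit makes it easy to pass a different encoding when the server does not send UTF-8.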

Other tips

You can also use Python's requests library.

import requests

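url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = requests.get(url)
data = response.json()  # named data rather than dict to avoid shadowing the built-in

Now you can work with data just like any other Python dictionary.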
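
Once the response has been parsed into a dictionary, discarding the unwanted parts is plain dict handling. The sketch below assumes a made-up parsed response (the frequency, column_names and data fields are illustrative guesses, not confirmed by the answers here), and keep_keys is a hypothetical helper.

```python
# Hypothetical parsed response; in practice this would come from response.json().
parsed = {
    "source_name": "Example Source",
    "frequency": "quarterly",
    "column_names": ["Date", "Value"],
    "data": [["2014-01-01", 17044.0], ["2013-10-01", 17078.3]],
}


def keep_keys(mapping, wanted):
    """Return a new dict containing only the keys listed in wanted."""
    return {key: mapping[key] for key in wanted if key in mapping}


# Throw away everything except the fields of interest.
trimmed = keep_keys(parsed, ["source_name", "data"])

# Pair each row with the column names to get one small dict per observation.
rows = [dict(zip(parsed["column_names"], row)) for row in trimmed["data"]]
print(rows[0])
```

Building a new dict instead of deleting keys in place leaves the original response untouched, which helps if other fields turn out to be needed later.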

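In Python 3, json works with Unicode text (the JSON format itself is defined only in terms of Unicode text), so you need to decode the bytes received in the HTTP response. r.headers.get_content_charset('utf-8') returns the character encoding declared by the response, falling back to utf-8 if none is declared: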

#!/usr/bin/env python3
import io
import json
from urllib.request import urlopen

with urlopen('https://httpbin.org/get') as r, \
     io.TextIOWrapper(r, encoding=r.headers.get_content_charset('utf-8')) as file:
    result = json.load(file)
print(result['headers']['User-Agent'])

There is no need to use io.TextIOWrapper here:

#!/usr/bin/env python3
import json
from urllib.request import urlopen

with urlopen('https://httpbin.org/get') as r:
    result = json.loads(r.read().decode(r.headers.get_content_charset('utf-8')))
print(result['headers']['User-Agent'])
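
To see how the charset lookup used above behaves, it can be reproduced offline. The headers object returned by urlopen is an http.client.HTTPMessage, which subclasses email.message.Message, so a plain Message is used here as a stand-in. This is only a sketch; charset_of is a hypothetical helper and the header values are made up.

```python
from email.message import Message


def charset_of(content_type, default="utf-8"):
    """Return the charset declared in a Content-Type value, or the default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


# A declared charset is returned in lower case.
print(charset_of("application/json; charset=ISO-8859-1"))

# With no charset parameter, the supplied default is returned instead.
print(charset_of("application/json"))
```

Passing 'utf-8' as the fallback matters because many JSON APIs omit the charset parameter entirely, and without a fallback the lookup would return None.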

I guess things changed in Python 3.4. This worked for me:

print("resp:" + json.dumps(resp.json()))

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow