Parsing an HTTP Response in Python

https://stackoverflow.com//questions/23049767

21-12-2019

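Question

I want to manipulate the information at this URL. I can open it and read its contents successfully. But what I really want to do is throw out everything I don't need and work with the parts I want to keep.

Is there a way to convert the string into a dict so that I can iterate over it? Or do I have to parse it as it is (as a str)?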

from urllib.request import urlopen

url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)

print(response.read()) # returns string with info

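Solution

When I printed response.read(), I noticed that a b was prepended to the string (e.g. b'{"a":1,..). The "b" stands for bytes and indicates the type of the object you are handling. Since I knew that a string can be converted to a dict with json.loads('string'), all I had to do was convert the bytes object to a string. I did this by decoding the response as UTF-8 with decode('utf-8'). Once it was a string, the problem was solved and I could easily iterate over the resulting dict.

I don't know whether this is the fastest or the most "Pythonic" way to write it, but it works, and there is always time for optimization and improvement later. The full code for my solution: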

from urllib.request import urlopen
import json

# Get the dataset
url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = urlopen(url)

# Convert bytes to string type and string type to dict
string = response.read().decode('utf-8')
json_obj = json.loads(string)

print(json_obj['source_name']) # prints the string with 'source_name' key
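
Once the response is a dict, the filtering described in the question is ordinary dictionary work. The following is only a minimal offline sketch: the sample bytes stand in for response.read(), and every key and value other than 'source_name' is made up for illustration rather than taken from the real API response.

```python
import json

# Hypothetical payload standing in for response.read(); the real API
# response will contain different keys and values.
raw = (b'{"source_name": "Example Source", "code": "GDP", '
       b'"column_names": ["Date", "Value"], '
       b'"data": [["2013-10-01", 1.0], ["2013-07-01", 2.0]]}')

json_obj = json.loads(raw.decode('utf-8'))

# Iterate over every key/value pair of the parsed dict
for key, value in json_obj.items():
    print(key, '->', value)

# Keep only the keys of interest and discard the rest
wanted = ('source_name', 'data')
filtered = {key: json_obj[key] for key in wanted if key in json_obj}
print(filtered)
```

The dict comprehension builds a new dict instead of deleting keys from json_obj, so the original parsed data stays available if it is needed later.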

Other tips

You can also use Python's requests library instead.

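import requests

url = 'http://www.quandl.com/api/v1/datasets/FRED/GDP.json'
response = requests.get(url)
data = response.json()  # parsed JSON body; named "data" to avoid shadowing the built-in dict

Now you can work with "data" like any other Python dictionary.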

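In Python 3, json works with Unicode text (the JSON format itself is defined only in terms of Unicode text), so you need to decode the bytes received in the HTTP response. r.headers.get_content_charset('utf-8') gets the character encoding, falling back to 'utf-8' if the response does not declare one: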

#!/usr/bin/env python3
import io
import json
from urllib.request import urlopen

with urlopen('https://httpbin.org/get') as r, \
     io.TextIOWrapper(r, encoding=r.headers.get_content_charset('utf-8')) as file:
    result = json.load(file)
print(result['headers']['User-Agent'])

It is not necessary to use io.TextIOWrapper here:

#!/usr/bin/env python3
import json
from urllib.request import urlopen

with urlopen('https://httpbin.org/get') as r:
    result = json.loads(r.read().decode(r.headers.get_content_charset('utf-8')))
print(result['headers']['User-Agent'])
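
To see what the charset fallback does without making a network call, the same lookup can be tried on a locally built header object. This is a sketch under assumptions: the headers returned by urlopen are an http.client.HTTPMessage, which is a subclass of email.message.Message, so a plain Message is used here as a stand-in, and the helper name decode_json_body as well as the sample headers and bodies are hypothetical.

```python
import json
from email.message import Message


def decode_json_body(body, headers):
    """Decode a JSON body using the declared charset, else UTF-8."""
    charset = headers.get_content_charset('utf-8')
    return json.loads(body.decode(charset))


# Hypothetical response 1: the server declares a non-UTF-8 charset
latin1_headers = Message()
latin1_headers['Content-Type'] = 'application/json; charset=ISO-8859-1'
latin1_body = '{"city": "Z\u00fcrich"}'.encode('iso-8859-1')

# Hypothetical response 2: no charset declared, so the fallback applies
plain_headers = Message()
plain_headers['Content-Type'] = 'application/json'
utf8_body = '{"city": "Z\u00fcrich"}'.encode('utf-8')

latin1_result = decode_json_body(latin1_body, latin1_headers)
utf8_result = decode_json_body(utf8_body, plain_headers)
print(latin1_headers.get_content_charset('utf-8'), latin1_result)
print(plain_headers.get_content_charset('utf-8'), utf8_result)
```

Both calls yield the same dict even though the raw bytes differ, which is the reason for reading the charset from the headers rather than hard-coding 'utf-8'.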

I think things have changed in Python 3.4. This worked for me:

print("resp:" + json.dumps(resp.json()))
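
As a related note, since Python 3.6 json.loads also accepts bytes or bytearray input encoded as UTF-8, UTF-16, or UTF-32, so the explicit decode step can be skipped for such payloads. A minimal sketch, assuming a made-up payload in place of a real response body:

```python
import json

# Hypothetical bytes payload, of the kind urlopen(...).read() returns
raw = b'{"source_name": "Example Source", "count": 2}'

# Python 3.6+: json.loads detects the encoding of bytes input itself
parsed = json.loads(raw)
print(parsed['source_name'])
```

If the server may send some other encoding, decoding explicitly with the charset from the response headers remains the safer route.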

License: CC-BY-SA with attribution

Not affiliated with StackOverflow