urllib2 not retrieving the entire HTTP response

https://stackoverflow.com/questions/1824069

22-07-2019

Question

I'm puzzled as to why I can't download the entire contents of some JSON responses from FriendFeed using urllib2.

>>> import urllib2
>>> stream = urllib2.urlopen('http://friendfeed.com/api/room/the-life-scientists/profile?format=json')
>>> stream.headers['content-length']
'168928'
>>> data = stream.read()
>>> len(data)
61058
>>> # We can see here that I did not retrieve the full JSON
... # given that the stream doesn't end with a closing }
... 
>>> data[-40:]
'ce2-003048343a40","name":"Vincent Racani'

How can I retrieve the full response with urllib2?

Solution

The best way to get all of the data:

import urllib2

fp = urllib2.urlopen("http://www.example.com/index.cfm")

# Keep reading until read() returns an empty string, which signals end of stream.
response = ""
while True:
    data = fp.read()
    if not data:
        break
    response += data

print response

The reason is that .read() isn't guaranteed to return the entire response, given the nature of sockets. I thought this was discussed in the documentation (maybe for urllib), but I can't find it.
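To make the short-read behaviour concrete without touching the network, here is a minimal sketch. ShortReadStream and read_all are hypothetical names introduced only for this illustration, not part of urllib2; the fake stream simply caps how much each read() returns, which is an assumption standing in for whatever the socket happened to deliver.

```python
import io


class ShortReadStream(object):
    """Hypothetical stand-in for a socket-backed response: each read()
    returns at most `max_per_read` bytes, even when more was requested."""

    def __init__(self, payload, max_per_read):
        self._buffer = io.BytesIO(payload)
        self._max_per_read = max_per_read

    def read(self, size=-1):
        # Cap every request at max_per_read to imitate a short read.
        if size is None or size < 0 or size > self._max_per_read:
            size = self._max_per_read
        return self._buffer.read(size)


def read_all(fp, chunk_size=8192):
    """Call fp.read() until it returns an empty value, then join the chunks."""
    chunks = []
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:  # an empty result means end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)


payload = b'{"name": "example"}' * 1000
stream = ShortReadStream(payload, max_per_read=1500)

first_read = stream.read()   # a single call returns only part of the body
rest = read_all(stream)      # looping picks up everything that is left
body = first_read + rest
```

Collecting chunks in a list and joining once avoids repeatedly copying a growing string, which matters for responses of a few hundred kilobytes like the one in the question.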

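Other answers

Use tcpdump to monitor the actual network interactions; then you can analyze why the site is broken for some client libraries. Script the test and repeat it several times so you can see whether the problem is consistent: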

import urllib2
url = 'http://friendfeed.com/api/room/friendfeed-feedback/profile?format=json'
stream = urllib2.urlopen(url)
expected = int(stream.headers['content-length'])
data = stream.read()
datalen = len(data)
print expected, datalen, expected == datalen

The site is working consistently for me, so I can't give an example of catching a failure. :)
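The repeat-the-test advice could be scripted roughly as follows. This is only a sketch: check_attempts and fake_fetch_factory are hypothetical helpers, and the fake fetch simulates one truncated response so the snippet runs offline. With urllib2, a real fetch would wrap urlopen() and return the content-length header together with stream.read(), as in the script above.

```python
def check_attempts(fetch, attempts=5):
    """Run `fetch` several times; return (expected, received, complete) per run.

    `fetch` is a hypothetical zero-argument callable returning
    (content_length_header, body).
    """
    results = []
    for _ in range(attempts):
        content_length, body = fetch()
        expected = int(content_length)
        results.append((expected, len(body), expected == len(body)))
    return results


def fake_fetch_factory(bodies, declared_length):
    """Simulated server for this sketch: hands out the given bodies in turn."""
    remaining = list(bodies)

    def fetch():
        return str(declared_length), remaining.pop(0)

    return fetch


full = b"x" * 100
# The second simulated response is cut short on purpose.
fetch = fake_fetch_factory([full, full[:40], full], declared_length=100)

report = check_attempts(fetch, attempts=3)
failures = [row for row in report if not row[2]]
```

A non-empty failures list that changes from run to run would suggest an intermittent problem rather than a consistently broken client or server.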

Keep calling stream.read() until it's done...

# iter() with a sentinel calls stream.read() until it returns '' (end of stream)
for data in iter(stream.read, ''):
    pass  # ... do stuff with data

readlines() also works.

License: CC-BY-SA with attribution

Not affiliated with StackOverflow