Python urllib2 progress hook

https://stackoverflow.com/questions/2028517

19-09-2019

Question

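I am trying to create a download progress bar in Python using the urllib2 HTTP client. I have looked through the API (and on Google), and it seems that urllib2 does not let you register progress hooks. The older, deprecated urllib does have this functionality, however.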

Does anyone know how to create a progress bar or reporting hook using urllib2? Or is there some other hack that gives similar functionality?
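For reference, the hook mentioned above survives in Python 3 as the reporthook argument of urllib.request.urlretrieve. The following is only a sketch under that assumption (Python 3, not urllib2); the names print_progress and fetch are made up for illustration.

```python
import sys
from urllib.request import urlretrieve


def print_progress(block_num, block_size, total_size):
    """Report hook: called once before the first block and once after each block."""
    if total_size > 0:
        # block_num * block_size can overshoot on the last block, so clamp it.
        done = min(block_num * block_size, total_size)
        sys.stdout.write("Downloaded %d of %d bytes (%0.2f%%)\r"
                         % (done, total_size, 100.0 * done / total_size))
    else:
        # urlretrieve passes -1 when the size is unknown.
        sys.stdout.write("Downloaded about %d bytes\r" % (block_num * block_size))
    sys.stdout.flush()


def fetch(url, dest_path, hook=print_progress):
    """Download url to dest_path, reporting progress through hook."""
    path, _headers = urlretrieve(url, dest_path, reporthook=hook)
    sys.stdout.write("\n")
    return path
```

A call such as fetch(some_url, "out.bin") would print a self-overwriting status line; some_url is a placeholder for whatever is being downloaded.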

Solution

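Here is a fully working example that builds on the approach in Anurag's answer of reading the response in chunks. My version lets you set the chunk size and attach an arbitrary reporting function.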

import urllib2, sys

def chunk_report(bytes_so_far, chunk_size, total_size):
   # Print one status line that overwrites itself via '\r'.
   percent = float(bytes_so_far) / total_size
   percent = round(percent*100, 2)
   sys.stdout.write("Downloaded %d of %d bytes (%0.2f%%)\r" %
       (bytes_so_far, total_size, percent))

   if bytes_so_far >= total_size:
      sys.stdout.write('\n')

def chunk_read(response, chunk_size=8192, report_hook=None):
   # Assumes the server sent a Content-Length header; without it,
   # getheader() returns None and the .strip() call raises AttributeError.
   total_size = response.info().getheader('Content-Length').strip()
   total_size = int(total_size)
   bytes_so_far = 0

   while True:
      chunk = response.read(chunk_size)
      bytes_so_far += len(chunk)

      # An empty read means the response is exhausted.
      if not chunk:
         break

      if report_hook:
         report_hook(bytes_so_far, chunk_size, total_size)

   # Only the byte count is returned; the chunks themselves are discarded.
   return bytes_so_far

if __name__ == '__main__':
   response = urllib2.urlopen('http://www.ebay.com')
   chunk_read(response, report_hook=chunk_report)
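Since chunk_read accepts any reporting function, the percentage line could be swapped for a drawn bar. The following is a rough sketch of such a hook, not part of the original answer; render_bar and bar_report are hypothetical names, and the hook keeps the same three-argument signature as chunk_report.

```python
import sys


def render_bar(bytes_so_far, total_size, width=30):
    """Return a text bar such as '[###-------] 30.00%'."""
    # Guard against a zero total and clamp overshoot to 100%.
    fraction = min(float(bytes_so_far) / total_size, 1.0) if total_size else 0.0
    filled = int(width * fraction)
    return "[%s%s] %0.2f%%" % ("#" * filled, "-" * (width - filled), fraction * 100)


def bar_report(bytes_so_far, chunk_size, total_size):
    """Reporting function with the same signature as chunk_report."""
    sys.stdout.write(render_bar(bytes_so_far, total_size) + "\r")
    if bytes_so_far >= total_size:
        sys.stdout.write("\n")
    sys.stdout.flush()
```

It would be attached the same way as the original hook: chunk_read(response, report_hook=bar_report).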

Other answers

Read the data in chunks and do whatever you want in between, for example run the download in a thread, hook it into a UI, and so on:

import urllib2

urlfile = urllib2.urlopen("http://www.google.com")

data_list = []
chunk = 4096
while 1:
    data = urlfile.read(chunk)
    if not data:
        print "done."
        break
    data_list.append(data)
    print "Read %s bytes"%len(data)

Output:

Read 4096 bytes
Read 3113 bytes
done.
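The "run it in a thread" idea could look roughly like the sketch below. It is written for Python 3 and assumes only that the response object has a read(size) method; download_worker and download_in_thread are hypothetical names. The worker posts running byte totals to a queue, and the calling thread (for instance a UI loop) consumes them.

```python
import queue
import threading


def download_worker(response, events, chunk_size=4096):
    """Read the response in chunks and post progress events to a queue."""
    chunks = []
    bytes_so_far = 0
    try:
        while True:
            data = response.read(chunk_size)
            if not data:
                break
            chunks.append(data)
            bytes_so_far += len(data)
            events.put(("progress", bytes_so_far))
        events.put(("done", b"".join(chunks)))
    except Exception as exc:
        # Hand the failure to the consumer instead of leaving it blocked.
        events.put(("error", exc))


def download_in_thread(response, on_progress, chunk_size=4096):
    """Run download_worker in a thread; call on_progress(total) in the caller's thread."""
    events = queue.Queue()
    worker = threading.Thread(target=download_worker,
                              args=(response, events, chunk_size))
    worker.start()
    while True:
        kind, payload = events.get()
        if kind == "progress":
            on_progress(payload)
            continue
        worker.join()
        if kind == "error":
            raise payload
        return payload
```

Because on_progress runs in the calling thread, it can touch UI state without extra locking; a real GUI would more likely poll the queue from a timer than block on it.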

urlgrabber has built-in support for progress notification.

A simplified version:

import sys, urllib2

# file_url (the URL to download) must be defined before this point.
temp_filename = "/tmp/" + file_url.split('/')[-1]
f = open(temp_filename, 'wb')
remote_file = urllib2.urlopen(file_url)

try:
    total_size = remote_file.info().getheader('Content-Length').strip()
    header = True
except AttributeError:
    header = False # a response doesn't always include the "Content-Length" header

if header:
    total_size = int(total_size)

bytes_so_far = 0

while True:
    buffer = remote_file.read(8192)
    if not buffer:
        sys.stdout.write('\n')
        break

    bytes_so_far += len(buffer)
    f.write(buffer)
    if not header:
        total_size = bytes_so_far # unknown size, so the line always shows 100%

    percent = float(bytes_so_far) / total_size
    percent = round(percent*100, 2)
    sys.stdout.write("Downloaded %d of %d bytes (%0.2f%%)\r" % (bytes_so_far, total_size, percent))

f.close()

A minor modification to Triptych's answer so that the file actually gets written (Python 3):

import sys
from urllib.request import urlopen

def chunk_report(bytes_so_far, chunk_size, total_size):
    percent = float(bytes_so_far) / total_size
    percent = round(percent*100, 2)
    sys.stdout.write("Downloaded %d of %d bytes (%0.2f%%)\r" %
                     (bytes_so_far, total_size, percent))

    if bytes_so_far >= total_size:
        sys.stdout.write('\n')


def chunk_read(response, chunk_size=8192, report_hook=None):
    total_size = response.info().get("Content-Length").strip()
    total_size = int(total_size)
    bytes_so_far = 0
    data = b""

    while 1:
        chunk = response.read(chunk_size)
        bytes_so_far += len(chunk)

        if not chunk:
            break

        if report_hook:
            report_hook(bytes_so_far, chunk_size, total_size)

        data += chunk

    return data

Usage:

with open(out_path, "wb") as f:
    response = urlopen(filepath)
    data_read = chunk_read(response, report_hook=chunk_report)

    f.write(data_read)
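The chunk_read above accumulates the whole download in memory before it is written. For large files, a variant that writes each chunk as it arrives may be preferable. This is only a sketch under that assumption; stream_to_file and parse_content_length are hypothetical names, and total_size is passed as None when the server sends no Content-Length, so the reporting function must tolerate that (the chunk_report above would not).

```python
def parse_content_length(value):
    """Turn a Content-Length header value into an int, or None if absent or invalid."""
    try:
        return int(value.strip())
    except (AttributeError, ValueError):
        return None


def stream_to_file(response, out_file, total_size=None,
                   chunk_size=8192, report_hook=None):
    """Copy response to out_file chunk by chunk; return the number of bytes written."""
    bytes_so_far = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            break
        # Write immediately so memory use stays at one chunk.
        out_file.write(chunk)
        bytes_so_far += len(chunk)
        if report_hook:
            report_hook(bytes_so_far, chunk_size, total_size)
    return bytes_so_far
```

With urlopen this would be called inside the same with block as in the usage above, passing parse_content_length(response.info().get("Content-Length")) as total_size and f as out_file.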

License: CC-BY-SA with attribution

Not affiliated with StackOverflow