Tornado curl HTTP client cannot fetch a binary file

https://stackoverflow.com/questions/18944275

29-06-2022

Question

I want to fetch an image (GIF format) from a website, so I am using Tornado's built-in asynchronous HTTP client. My code looks like this:

import tornado.httpclient
import tornado.ioloop
import tornado.gen
import tornado.web

tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")
http_client = tornado.httpclient.AsyncHTTPClient()

class test(tornado.web.RequestHandler):
    @tornado.gen.coroutine
    def get(self):
        content = yield http_client.fetch('http://www.baidu.com/img/bdlogo.gif')
        print('=====', type(content.body))

application = tornado.web.Application([
    (r'/', test)
    ])
application.listen(80)
tornado.ioloop.IOLoop.instance().start()

When I visit the server, it should fetch the GIF file. However, it raises an exception:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 8: invalid start byte
ERROR:tornado.application:Uncaught exception GET / (127.0.0.1)
HTTPRequest(protocol='http', host='127.0.0.1', method='GET', uri='/', version='HTTP/1.1', remote_ip='127.0.0.1', headers={'Accept-Language': 'zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3', 'Accept-Encoding': 'gzip, deflate', 'Host': '127.0.0.1', 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8', 'User-Agent': 'Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/20130922 Firefox/17.0', 'Connection': 'keep-alive', 'Cache-Control': 'max-age=0', 'If-None-Match': '"da39a3ee5e6b4b0d3255bfef95601890afd80709"'})
Traceback (most recent call last):
  File "/usr/lib/python3.2/site-packages/tornado/web.py", line 1144, in _when_complete
    if result.result() is not None:
  File "/usr/lib/python3.2/site-packages/tornado/concurrent.py", line 129, in result
    raise_exc_info(self.__exc_info)
  File "<string>", line 3, in raise_exc_info
  File "/usr/lib/python3.2/site-packages/tornado/stack_context.py", line 302, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 550, in inner
    self.set_result(key, result)
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 476, in set_result
    self.run()
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 505, in run
    yielded = self.gen.throw(*exc_info)
  File "test.py", line 12, in get
    content = yield http_client.fetch('http://www.baidu.com/img/bdlogo.gif')
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 496, in run
    next = self.yield_point.get_result()
  File "/usr/lib/python3.2/site-packages/tornado/gen.py", line 395, in get_result
    return self.runner.pop_result(self.key).result()
  File "/usr/lib/python3.2/concurrent/futures/_base.py", line 393, in result
    return self.__get_result()
  File "/usr/lib/python3.2/concurrent/futures/_base.py", line 352, in __get_result
    raise self._exception
tornado.curl_httpclient.CurlError: HTTP 599: Failed writing body (0 != 1024)
ERROR:tornado.access:500 GET / (127.0.0.1) 131.53ms

It seems to be trying to decode my binary file as UTF-8 text, which is unnecessary. If I comment

tornado.httpclient.AsyncHTTPClient.configure("tornado.curl_httpclient.CurlAsyncHTTPClient")

out, so that the simple HTTP client is used instead of pycurl, it works well (it tells me that the type of content.body is bytes).

So if it returns a bytes object, why does it try to decode it? I think the problem is in pycurl or in Tornado's wrapper around pycurl, right?

My Python version is 3.2.5, with Tornado 3.1.1 and pycurl 7.19.

Thanks!

Solution

pycurl 7.19 doesn't support Python 3. Ubuntu (and possibly other Linux distributions) ships a modified version of pycurl that partially works with Python 3, but it doesn't work with Tornado (https://github.com/facebook/tornado/issues/671), and it fails with an exception that looks like the one you're seeing here.
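To see why decoding image data as UTF-8 fails at all, here is a minimal sketch in plain Python with no Tornado or pycurl involved. The header bytes are made up for illustration (they are not read from the actual file), and try_decode is a hypothetical helper. It does not show where in the pycurl wrapper the decoding happens, only that binary data generally isn't valid UTF-8.

```python
# Hypothetical 10-byte GIF header: the "GIF89a" signature followed by a
# width and a height, each stored as a little-endian 16-bit integer.
# The dimensions are invented for this example.
header = b"GIF89a" + (270).to_bytes(2, "little") + (129).to_bytes(2, "little")


def try_decode(data):
    """Return the decoded text, or the UnicodeDecodeError if decoding fails."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc


result = try_decode(header)
print(type(result).__name__, result)
```

The ASCII signature decodes fine on its own; it is the raw dimension bytes after it that break the decoder. That is why a response body like this has to stay a bytes object from end to end.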

Until there's a new version of pycurl that officially supports Python 3 (or you use the change suggested in that Tornado bug report), I'm afraid you'll need to either go back to Python 2.7 or use Tornado's simple_httpclient instead.
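A rough sketch of the second option, assuming you want to pick the client explicitly instead of just deleting the configure() call. The helper names choose_client_impl and configure_http_client, and the pycurl_supports_py3 flag, are hypothetical; only the two class paths and AsyncHTTPClient.configure() come from Tornado itself.

```python
SIMPLE_CLIENT = "tornado.simple_httpclient.SimpleAsyncHTTPClient"
CURL_CLIENT = "tornado.curl_httpclient.CurlAsyncHTTPClient"


def choose_client_impl(python_major, pycurl_supports_py3):
    """Pick an AsyncHTTPClient implementation name (hypothetical helper).

    Assumption: the curl client is only trusted on Python 2, or on
    Python 3 when the caller knows the installed pycurl supports it.
    """
    if python_major >= 3 and not pycurl_supports_py3:
        return SIMPLE_CLIENT
    return CURL_CLIENT


def configure_http_client(python_major, pycurl_supports_py3):
    """Apply the choice to Tornado and return the name that was used."""
    # Imported here so the selection logic above works without Tornado.
    import tornado.httpclient

    impl = choose_client_impl(python_major, pycurl_supports_py3)
    tornado.httpclient.AsyncHTTPClient.configure(impl)
    return impl
```

With the setup from the question you would call configure_http_client(sys.version_info[0], False) before creating the client, which selects the simple client on Python 3. The flag is passed in by hand because whether a given pycurl build really works with Python 3 can't be detected reliably from its version number alone.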

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow