Question

In Tornado we can use the coroutine decorator to write an asynchronous function neatly as a Python generator, where each yield statement returns to the scheduler and the final raise/return returns a single value to the caller. But is there any way to return a sequence of values to the caller, interspersed with asynchronous calls?

E.g. how could I turn this synchronous function:

def crawl_site_sync(rooturi):
    rootpage = fetch_page_sync(rooturi)
    links = extract_links(rootpage)
    for link in links:
        yield fetch_page_sync(link.uri)

...which I can call like this:

for page in crawl_site_sync("http://example.com/page.html"):
    show_summary(page)

...into a similar-looking asynchronous function in Tornado? E.g.:

@tornado.gen.coroutine
def crawl_site_async(rooturi):
    # Yield a future to the scheduler:
    rootpage = yield fetch_page_async(rooturi)
    links = extract_links(rootpage)
    for link in links:
        # Yield a future to the scheduler:
        sub_page = yield fetch_page_async(link.uri)
        # Yield a value to the caller:
        really_really_yield sub_page # ???

And how would I call it?

for page in yield crawl_site_async("http://example.com/page.html"):
    # This won't work, the yield won't return until the entire
    # coroutine has finished, and it won't give us an iterable.
    show_summary(page)

I can think of ways to get it done, but all of them involve changing the call-site and the function to such a degree that it completely loses the benefit of the asynchronous version looking very similar to the synchronous version, and it no longer composes cleanly. I feel like I must be missing a trick here. Is there some way to simultaneously use a Python generator as a sequence of lazily computed values and as a Tornado coroutine?


Solution

I'd use a Queue from Toro, which is designed for coroutines to cooperate like this. Here's a simple example:

from tornado.ioloop import IOLoop
from tornado import gen
from tornado.httpclient import AsyncHTTPClient
from toro import Queue

q = Queue(maxsize=1)


@gen.coroutine
def consumer():
    item = yield q.get()
    while item:
        print(item)
        item = yield q.get()


@gen.coroutine
def producer():
    try:
        client = AsyncHTTPClient()
        for url in [
                'http://tornadoweb.org',
                'http://python.org',
                'http://readthedocs.org']:
            response = yield client.fetch(url)
            item = (url, len(response.body))
            yield q.put(item)

        # Done: send the sentinel, and yield the put so errors propagate.
        yield q.put(None)
    except Exception:
        IOLoop.current().stop()
        raise

future = producer()
IOLoop.current().run_sync(consumer, timeout=20)

A more detailed web crawler example is in Toro's docs, here:

https://toro.readthedocs.org/en/stable/examples/web_spider_example.html
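For a self-contained illustration of the same producer/consumer idea applied to the question's crawler shape, here is a sketch using the standard library's asyncio.Queue (so it runs without toro; `fetch_page_async` and `extract_links` are hypothetical stand-ins for the question's functions, and the exact page strings are invented for the demo):

```python
import asyncio


async def fetch_page_async(uri):
    # Hypothetical stand-in for a real asynchronous HTTP fetch.
    await asyncio.sleep(0)
    return f"<page {uri}>"


def extract_links(page):
    # Hypothetical stand-in: pretend every page links to two child URIs.
    return ["http://example.com/a", "http://example.com/b"]


async def crawl_site_async(rooturi, q):
    """Producer: fetch pages and hand each one to the consumer via the queue."""
    rootpage = await fetch_page_async(rooturi)
    for link in extract_links(rootpage):
        page = await fetch_page_async(link)
        await q.put(page)   # "really_really_yield": deliver a value to the caller
    await q.put(None)       # sentinel: no more pages


async def main():
    q = asyncio.Queue(maxsize=1)
    crawler = asyncio.create_task(
        crawl_site_async("http://example.com/page.html", q))
    pages = []
    while True:
        page = await q.get()
        if page is None:
            break
        pages.append(page)
    await crawler           # surface any exception the producer raised
    return pages


pages = asyncio.run(main())
print(pages)
```

The call site stays a plain loop over queue gets, and the crawler body keeps the shape of the synchronous version: each `await q.put(page)` plays the role the bare `yield page` played in `crawl_site_sync`.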

Licensed under: CC-BY-SA with attribution