Question

I found this example in the gevent documentation, but I want to control the number of concurrent requests that gevent makes:

import gevent
import urllib2

from gevent import monkey
monkey.patch_all()

urls = ['http://www.google.com', 'http://www.example.com', 'http://www.python.org']

def print_head(url):
    print 'Starting %s' % url
    data = urllib2.urlopen(url).read()

    print '%s: %s bytes: %r' % (url, len(data), data[:50])

jobs = [gevent.spawn(print_head, url) for url in urls]
gevent.joinall(jobs, timeout=2)

How can I limit the number of concurrent connections to 50 if, for example, I have 5000 URLs to request?

Solution

Use gevent.pool. A typical pattern looks like this:

import gevent.pool

# 50 is the pool size: at most 50 greenlets run at once
pool = gevent.pool.Pool(50)
for url in urls:
    # spawn() blocks while the pool is full
    pool.spawn(print_head, url)
pool.join(timeout=2)

You spawn the greenlets directly in the fixed-size pool, then wait for the pool to finish executing the requests. A timeout of 2 seconds is probably too short for 5000 requests, though; increase it, or omit it to wait for all of them.
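
To see the limit in action without hitting the network, here is a minimal sketch. The function fake_fetch, the counters active and peak, and the placeholder URLs are all made up for illustration; fake_fetch stands in for print_head and sleeps cooperatively instead of downloading. The pool size is 3 rather than 50 to keep the run short.

```python
import gevent
from gevent.pool import Pool

POOL_SIZE = 3  # would be 50 in the question
active = 0     # greenlets currently "downloading"
peak = 0       # highest value `active` has reached


def fake_fetch(url):
    """Hypothetical stand-in for print_head: sleeps instead of doing I/O."""
    global active, peak
    active += 1
    peak = max(peak, active)
    gevent.sleep(0.01)  # yields to other greenlets, as a patched socket read would
    active -= 1
    return url


# Placeholder URLs, never actually requested
urls = ['http://example.com/%d' % i for i in range(10)]

pool = Pool(POOL_SIZE)
# spawn() blocks the caller while the pool is full, so at most
# POOL_SIZE greenlets are running at any moment
jobs = [pool.spawn(fake_fetch, url) for url in urls]
pool.join()  # no timeout: wait until every job has finished

results = [job.value for job in jobs]
print('peak concurrency: %d' % peak)
```

The printed peak should never exceed POOL_SIZE, however many URLs are queued. Each job's return value is available as job.value once the pool has been joined.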

Licensed under: CC-BY-SA with attribution