Question

I wrote a script to learn about gevent.pool.Pool, but I have seen a strange phenomenon.

In my code, I have three different code segments, named version 1, version 2 and version 3.

  • With version 2 and version 3 commented out, i.e. using only the imap() calls in version 1, nothing happens.
  • With version 1 and version 3 commented out, i.e. using only the map() calls in version 2, the first map() call creates two greenlets, which then execute. After those two greenlets are done, the second map() call does the same thing.
  • With version 1 and version 2 commented out, i.e. calling imap() first and then map() as in version 3, the five greenlets are not created and executed until the map() call runs.

So I have two questions:

  • Why does the map() method trigger execution while imap() does not?
  • Why does the Pool instance have a non-zero length after map() triggers the execution?

I have read the source code of pool.py in gevent-1.0, but I don't understand how it adds greenlets to the variable self.greenlets, or what the difference between map() and imap() is. In my opinion, imap() just returns an iterable object and map() returns a list built from the results that imap() yields.

Here is the source code of map() and imap() in pool.py of gevent:

def map(self, func, iterable):
    return list(self.imap(func, iterable))

def imap(self, func, iterable):
    """An equivalent of itertools.imap()"""
    return IMap.spawn(func, iterable, spawn=self.spawn)

Here is my test code:

#!/usr/bin/env python2.7
#coding: utf-8

import gevent
from gevent.pool import Pool
from gevent.coros import BoundedSemaphore


class TestSemaphore(object):

    def __init__(self):
        self.sem = BoundedSemaphore(1)
        self.pool = Pool()

    def run(self):
        # version 1 
        self.pool.imap(self._worker, xrange(0, 2))
        self.pool.imap(self._worker, xrange(3, 6))
        # end of version 1

        # version 2
        # self.pool.map(self._worker, xrange(0, 2))
        # self.pool.map(self._worker, xrange(3, 6))
        # end of version 2

        # version 3
        # self.pool.imap(self._worker, xrange(0, 2))
        # self.pool.map(self._worker, xrange(3, 6))
        # end of version 3

    def _worker(self, pid):
        with self.sem:
            print('worker %d acquired semaphore, length of pool is %d' % (pid, len(self.pool)))
            gevent.sleep(0)
        print('worker %d released semaphore, length of pool is %d' % (pid, len(self.pool))) 

if __name__ == '__main__':
    test = TestSemaphore()
    test.run()

Solution

The key thing to note is that imap is lazy - it does not do any work until you actually consume the resulting iterator:

>>> map(lambda x: x, xrange(0, 2))
[0, 1]

>>> from itertools import imap
>>> imap(lambda x: x, xrange(0, 2))
<itertools.imap object at 0xsome-address>

# Consume the resulting iterator
>>> list(imap(lambda x: x, xrange(0, 2)))
[0, 1]

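imap in multiprocessing and gevent likewise returns an iterator rather than a finished list. As the pool.py source quoted in the question shows, map() is just list(self.imap(func, iterable)), so map() consumes the iterator and blocks until every result is in, which is what gives the worker greenlets a chance to run. In version 1 the results of imap() are discarded and nothing ever waits on them, so the script exits before any worker runs. To make version 1 do its work, consume the iterators, for example by wrapping each imap() call in list() or by looping over it.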
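
The lazy/eager difference can be sketched without gevent at all. This is only an analogy, not gevent's implementation: lazy_map, eager_map, worker and calls are hypothetical names made up for the illustration, and eager_map simply mirrors the shape of the map() source quoted in the question.

```python
# Hypothetical helpers for illustration only; none of these names come from gevent.
calls = []  # records the order in which worker() is actually invoked


def worker(n):
    # Record the call so we can see when the work really happens.
    calls.append(n)
    return n * 2


def lazy_map(func, iterable):
    # A generator: its body does not start running until it is iterated.
    for item in iterable:
        yield func(item)


def eager_map(func, iterable):
    # Same shape as map() in the question: list() drains the lazy iterator.
    return list(lazy_map(func, iterable))


pending = lazy_map(worker, range(3))      # like imap(): no work done yet
calls_before = list(calls)                # snapshot: still empty

results = eager_map(worker, range(3, 6))  # like map(): work happens here
calls_after_map = list(calls)             # snapshot: only 3, 4, 5 ran

drained = list(pending)                   # consuming the iterator runs 0, 1, 2
```

One difference from the gevent case is worth keeping in mind: here the pending iterator stays idle until it is drained explicitly, whereas in version 3 of the question the blocking map() call apparently also lets the greenlets scheduled by the earlier imap() run, which would explain why all five workers execute at that point.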

Licensed under: CC-BY-SA with attribution