Question

Say I do this:

import multiprocessing as mp

y = 10

def f(x) :
  return x + y

for i in xrange(2) :
  y = i
  pool = mp.Pool( processes = 2 )
  print pool.map( f, xrange(5) )

pool.close()
pool.join()

Output:

[0, 1, 2, 3, 4]
[1, 2, 3, 4, 5]

Ok, this is what I expected. But now let's move the declaration of pool outside of the for loop:

y = 10

def f(x) :
  return x + y

pool = mp.Pool( processes = 2 )

for i in xrange(2) :
  y = i
  print pool.map( f, xrange(5) )

Output:

[10, 11, 12, 13, 14]
[10, 11, 12, 13, 14]

The new value of y is ignored! What's going on?

Solution

From https://docs.python.org/2/library/multiprocessing.html:

On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

This functionality (access to the parent's global variables) is achieved simply by copying the parent's entire namespace into the child at the moment the child process is created (on Unix, via fork).

So, when you do

y = i
pool = mp.Pool( processes = 2 )

each child process gets y with the value 0 (or 1, on the second iteration of the loop), because the pool is created after the assignment.

Similarly, in the code

y = 10
...
pool = mp.Pool( processes = 2 )
...
y = i

when the child processes are created, each gets a copy of the parent's environment, in which y is still 10. Any later change to y in the parent has no effect in the child processes.
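
Following the documentation's advice to pass the object as an argument, one possible way to keep a single pool and still have the workers see the current value is to send y along with each task. The sketch below is only an illustration: the names add_offset and run_with_offsets are made up for this example, and functools.partial is just one of several ways to bind the extra argument. It is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function

import multiprocessing as mp
from functools import partial


def add_offset(y, x):
    # Hypothetical replacement for f: y is an explicit argument,
    # so the worker never reads the (possibly stale) global.
    return x + y


def run_with_offsets(offsets, values):
    # The pool is created once and reused for every offset.
    pool = mp.Pool(processes=2)
    try:
        # partial(add_offset, y) is pickled and sent with the tasks,
        # so each map call carries the current value of y.
        return [pool.map(partial(add_offset, y), values) for y in offsets]
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    for row in run_with_offsets(range(2), range(5)):
        print(row)
```

Because y travels with the task instead of living in the worker's copy of the namespace, it does not matter whether the pool was created before or after y changed.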

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow