Question

I need to run the same function on the same data many times. For this I am using multiprocessing.Pool to speed up the computation.

from multiprocessing import Pool
import numpy as np
x=np.array([1,2,3,4,5])

def func(x):  # stand-in for a function that takes 3 minutes
    m = np.mean(x)
    return m

p=Pool(100)
mapper = p.map(func, [x] * 500)

The program works, but at the end I have 100 Python processes still open and my whole system slows down.

How can I solve this? Am I using Pool the wrong way? Should I use another function?

EDIT: using p = Pool(multiprocessing.cpu_count()), will my PC use 100% of its power? Or is there something else I should use?

Was it helpful?

Solution

In addition to limiting yourself to

p = Pool(multiprocessing.cpu_count())

I believe you also want to do the following when you're finished:

p.close()

This tells the pool to stop accepting new work, so the worker processes can exit once their tasks are complete.

Other tips

As a general rule, you don't want many more pool processes than you have CPU cores, because your computer can't parallelize the work beyond the number of cores available to actually do the processing. It doesn't matter that you've got 100 processes if your CPU can only process four things simultaneously. A common practice is to do this:

p = Pool(multiprocessing.cpu_count())
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow