Question

I need to run the same function on the same data many times. To speed up the computation I am using multiprocessing.Pool.

from multiprocessing import Pool
import numpy as np

x = np.array([1, 2, 3, 4, 5])

def func(x):  # this should be a function that takes 3 minutes
    m = np.mean(x)
    return m

p = Pool(100)
mapper = p.map(func, [x] * 500)

The program works well, but at the end 100 Python processes are left open and my whole system starts to slow down.

How can I solve this? Am I using Pool in the wrong way? Should I use another function?

EDIT: if I use p = Pool(multiprocessing.cpu_count()), will my PC use 100% of its power? Or is there something else I should use?


Solution

In addition to limiting yourself to

p = Pool(multiprocessing.cpu_count())

I believe you also want to do the following when you're finished:

p.close()

This stops the pool from accepting new tasks; follow it with p.join() to wait for the worker processes to exit, so they don't linger after the work is done.
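Putting the pieces together, a minimal sketch might look like this (the pool is sized to the CPU count, and the 3-minute function is stood in for by a simple np.mean; the helper name run is made up for illustration):

```python
from multiprocessing import Pool, cpu_count

import numpy as np

x = np.array([1, 2, 3, 4, 5])

def func(x):
    # stand-in for the long-running computation
    return np.mean(x)

def run():
    # size the pool to the number of CPU cores, not 100
    p = Pool(cpu_count())
    try:
        results = p.map(func, [x] * 8)
    finally:
        p.close()  # stop accepting new tasks
        p.join()   # wait for the worker processes to exit
    return results

if __name__ == "__main__":
    print(run())  # each task returns the mean of x, i.e. 3.0
```

With close() followed by join(), no worker processes should be left behind once the map call has finished.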

Other tips

As a general rule, you don't want many more pool workers than you have CPU cores, because your computer can't parallelize the work beyond the number of cores available to actually do the processing. It doesn't matter that you have 100 processes when your CPU can only run four things simultaneously. A common practice is to do this:

p = Pool(multiprocessing.cpu_count())
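On Python 2.7+ / 3.x, Pool can also be used as a context manager, which takes care of shutting the pool down even if an exception is raised mid-computation (again using np.mean as a stand-in for the real function):

```python
from multiprocessing import Pool, cpu_count

import numpy as np

x = np.array([1, 2, 3, 4, 5])

def func(x):
    # stand-in for the long-running computation
    return np.mean(x)

if __name__ == "__main__":
    # the with-block terminates the workers on exit, so none linger
    with Pool(cpu_count()) as p:
        results = p.map(func, [x] * 8)
    print(results)
```

Note that on exit the context manager calls terminate() rather than close()/join(), which is fine here because all results have already been collected by map().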
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow