The multiprocessing module is meant for this use case. A simple, complete example of its usage:
    import multiprocessing

    def my_function(x):
        """The function you want to compute in parallel."""
        x += 1
        return x

    if __name__ == '__main__':
        pool = multiprocessing.Pool()
        results = pool.map(my_function, [1, 2, 3, 4, 5, 6])
        print(results)
The call pool.map will execute my_function with argument 1, then 2, and so on, but in parallel.
Note that my_function takes only one argument. If you have a function f that takes n arguments, simply write a helper function f_helper:
    def f_helper(args):
        return f(*args)
and pack each set of arguments into a tuple. For example:

    results = pool.map(f_helper, [(1, 2, 3), (4, 5, 6), (7, 8, 9)])

is equivalent to:

    [f(1, 2, 3), f(4, 5, 6), f(7, 8, 9)]

but the calls to f are executed in parallel.
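Putting the pieces together, here is a minimal runnable sketch. The three-argument function f is hypothetical (it just sums its arguments) and stands in for whatever you want to parallelize:

    import multiprocessing

    def f(a, b, c):
        # Hypothetical example function taking three arguments.
        return a + b + c

    def f_helper(args):
        # Unpack the tuple and forward to f.
        return f(*args)

    if __name__ == '__main__':
        with multiprocessing.Pool() as pool:
            results = pool.map(f_helper, [(1, 2, 3), (4, 5, 6), (7, 8, 9)])
        print(results)  # [6, 15, 24]

On Python 3.3+ you can skip the helper entirely and call pool.starmap(f, [(1, 2, 3), ...]), which unpacks each tuple for you.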
Note: since the code runs in a different process, any side effect of f won't be preserved. For example, if you modify the original argument, the main process will not see the change. Think of it this way: the arguments are copied and passed to the child process, which computes the result, and the result is then copied back into the main process.
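You can see this copying behavior directly. In this small demonstration a child process mutates its argument, but the parent's original list is untouched; only the returned copy reflects the change:

    import multiprocessing

    def append_item(lst):
        # This mutation happens on the child's copy of the list.
        lst.append(99)
        return lst

    if __name__ == '__main__':
        original = [1, 2]
        with multiprocessing.Pool(1) as pool:
            result = pool.map(append_item, [original])[0]
        print(original)  # [1, 2] -- unchanged in the parent process
        print(result)    # [1, 2, 99] -- the copy returned from the child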
If the function you are trying to compute doesn't take long enough, copying the arguments and the return value can take more time than running the code serially.
The documentation contains various examples of how to use the module.