Importing scipy breaks multiprocessing support in Python

Question

After much digging around and posting an issue on the Scipy GitHub site, I've found a solution.

Before I start, this is documented very well here - I'll just give an overview.

This problem is not related to the version of SciPy or NumPy that I was using. It originates in the system BLAS libraries that NumPy and SciPy use for their linear algebra routines. You can tell which libraries NumPy is linked against by running

python -c 'import numpy; numpy.show_config()'

If you are using OpenBLAS on Linux, you may find that the CPU affinity mask is set to 1 once the library is loaded in Python (via NumPy/SciPy). The process is then pinned to a single core, and because child processes inherit the affinity of their parent, every multiprocessing worker ends up on that same core. To test this, run the following in a Python session:

import os
os.system('taskset -p %s' %os.getpid())

If the CPU affinity is returned as f or ff (a hexadecimal bitmask with one bit per core), you can access multiple cores. In my case it would start like that, but upon importing numpy or scipy.any_module it would switch to 1, hence my problem.
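
To make the mask values above easier to read, here is a minimal sketch that decodes a taskset-style hexadecimal mask into the list of core indices it allows. The helper name mask_to_cpus is made up for illustration and is not part of taskset or any library.

```python
def mask_to_cpus(hex_mask):
    """Decode a hexadecimal affinity mask (as printed by taskset -p)
    into the list of CPU indices whose bit is set."""
    bits = int(hex_mask, 16)
    return [cpu for cpu in range(bits.bit_length()) if (bits >> cpu) & 1]

print(mask_to_cpus("1"))   # a single core: CPU 0
print(mask_to_cpus("f"))   # four cores: CPUs 0 to 3
print(mask_to_cpus("ff"))  # eight cores: CPUs 0 to 7
```

So a mask of 1 means the process may run on CPU 0 only, which is the state I ended up in after the import.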

I've found two solutions:

Change CPU affinity

You can manually set the CPU affinity of the master process at the top of the main function so that the code looks like this:

import multiprocessing
import numpy as np  # importing numpy is what resets the affinity
import math
import time
import os

def compute_something(t):
    # CPU-bound busy work so the pool speed-up is measurable
    a = 0.
    for i in range(10000000):
        a = math.sqrt(t)
    return a

if __name__ == '__main__':

    pool_size = multiprocessing.cpu_count()
    # Re-allow all cores (indices 0 to pool_size - 1) for this process
    os.system('taskset -cp 0-%d %s' % (pool_size - 1, os.getpid()))

    print("Pool size:", pool_size)
    pool = multiprocessing.Pool(processes=pool_size)

    inputs = range(10)

    tic = time.time()
    # list() forces evaluation; map() is lazy in Python 3
    builtin_outputs = list(map(compute_something, inputs))
    print('Built-in:', time.time() - tic)

    tic = time.time()
    pool_outputs = pool.map(compute_something, inputs)
    print('Pool    :', time.time() - tic)

Note that giving taskset an upper bound beyond the last core index doesn't seem to matter: it just uses all the cores that are available.
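
If you would rather not shell out to taskset, Python 3.3+ on Linux exposes the same mechanism through os.sched_setaffinity and os.sched_getaffinity. The following is a minimal sketch of that alternative, not something I tested in the original setup. The function name reset_affinity is made up for illustration, and the sketch assumes you call it after the numpy/scipy imports and before creating the pool.

```python
import os

def reset_affinity():
    """Allow the current process to run on every CPU the OS reports.

    Returns the resulting set of CPU indices, or None on platforms
    without the affinity API (it is Linux-only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    n_cpus = os.cpu_count() or 1
    try:
        # pid 0 means "the calling process"
        os.sched_setaffinity(0, range(n_cpus))
    except OSError:
        # e.g. a restricted container; keep whatever mask we have
        pass
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    print("Allowed CPUs:", reset_affinity())
```

Worker processes created afterwards inherit the widened mask, so the pool can spread across cores again.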

Switch BLAS libraries

This solution is documented at the site linked above. In short: install libatlas and run update-alternatives so that NumPy uses ATLAS rather than OpenBLAS.