Question

When I have to parallelize an algorithm in Python, I usually use the multiprocessing map function.
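For reference, this is a minimal sketch of the multiprocessing map pattern I mean (the square function here is just a placeholder for the real work):

```python
from multiprocessing import Pool

def square(x):
    # Stand-in for the actual per-item computation.
    return x * x

if __name__ == "__main__":
    # Spread the work over 4 worker processes.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)
```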

In scikit-learn's RandomizedLasso, it seems they are using something different (the n_jobs parameter).

I am not an expert in parallel computing in Python, and I hope I can learn something new from this.

Can anyone explain to me what they are using?
In their situation I would have used multiprocessing. Why did they choose something different?


Solution

n_jobs is fed to joblib, which is used for all parallel processing in scikit-learn. As you can see on the joblib website, it's much easier to use than multiprocessing; it's also more feature-rich, as it can use either processes or threads (faster when executing C code) and has shared-memory support for NumPy arrays.
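As a rough sketch (not scikit-learn's actual internals), this is what the joblib API looks like: Parallel and delayed are the core entry points, and the backend argument is what lets it switch between processes and threads.

```python
from math import sqrt
from joblib import Parallel, delayed

# Process-based parallelism (the default backend): good for pure-Python work.
results = Parallel(n_jobs=2)(delayed(sqrt)(i) for i in range(10))

# Thread-based parallelism: useful when the work releases the GIL
# (e.g. NumPy / Cython / C code), since it avoids spawning processes
# and copying data between them.
results_threads = Parallel(n_jobs=2, backend="threading")(
    delayed(sqrt)(i) for i in range(10)
)

print(results)
print(results_threads)
```

Inside scikit-learn, setting n_jobs on an estimator such as RandomizedLasso is essentially passed through to a Parallel call like the one above.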

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow