Question

I'm doing a simple Monte Carlo simulation exercise, using ipcluster engines of IPython. I've noticed a huge difference in execution time depending on how I define my function, and I'd like to understand why. Here are the details:

When I define the task as below, it is fast:

def sample(n):
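    # count of random points (x, y) in [0, 1)^2 that land inside the quarter circle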
    return (rand(n)**2 + rand(n)**2 <= 1).sum()

When run in parallel:

from IPython.parallel import Client
rc = Client()
v = rc[:]
with v.sync_imports():
    from numpy.random import rand
n = 1000000

timeit -r 1 -n 1 print 4.* sum(v.map_sync(sample, [n]*len(v))) / (n*len(v))

3.141712
1 loops, best of 1: 53.4 ms per loop

But if I change the function to:

def sample(n):
    return sum(rand(n)**2 + rand(n)**2 <= 1)

I get:
3.141232
1 loops, best of 1: 3.81 s per loop

...which is about 71 times slower. What could be the reason for this?


Solution

I can't go too in-depth, but the reason it is slower is that sum(<array>) is the built-in CPython sum function, which iterates over the array one element at a time at the Python level, whereas your <numpy array>.sum() uses NumPy's sum method, which performs the reduction in a single vectorized C loop and is substantially faster.

I imagine you would get similarly fast results if you replaced sum(<array>) with numpy.sum(<array>).
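For a rough sense of the gap, here is a minimal single-process timing sketch (not your original benchmark; the array size and repeat count are arbitrary) comparing the built-in sum with the two NumPy reductions:

import numpy as np
from numpy.random import rand
from timeit import timeit

n = 1000000
inside = rand(n)**2 + rand(n)**2 <= 1  # boolean array of "hits"

# Built-in sum: loops over the array one element at a time in Python.
t_builtin = timeit(lambda: sum(inside), number=10)

# NumPy reductions: a single vectorized loop in C.
t_method = timeit(lambda: inside.sum(), number=10)
t_npsum = timeit(lambda: np.sum(inside), number=10)

print("builtin sum: %.3f s" % t_builtin)
print("ndarray.sum: %.3f s" % t_method)
print("numpy.sum:   %.3f s" % t_npsum)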

See the numpy.sum docs here: http://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow