Question

I am numerically solving for x(t) for a system of first-order differential equations. The system is:

dx/dt = y
dy/dt = -x - a*y*(x^2 + y^2 - 1)

I have implemented the Forward Euler method to solve this problem as follows:

import numpy as np

def forward_euler():
    h = 0.01
    num_steps = 10000

    x = np.zeros([num_steps + 1, 2])  # steps, number of solutions
    y = np.zeros([num_steps + 1, 2])
    a = 1.

    x[0, 0] = 10.  # initial condition, 1st solution
    y[0, 0] = 5.

    x[0, 1] = 0.   # initial condition, 2nd solution
    y[0, 1] = 0.0000000001

    for step in range(num_steps):
        x[step + 1] = x[step] + h * y[step]
        y[step + 1] = y[step] + h * (-x[step] - a * y[step] * (x[step] ** 2 + y[step] ** 2 - 1))

    return x, y

Now I would like to vectorize the code further and keep x and y in the same array. I have come up with the following solution:

def forward_euler_vector():
    num_steps = 10000
    h = 0.01

    x = np.zeros([num_steps + 1, 2, 2])  # steps, variables, number of solutions
    a = 1.

    x[0, 0, 0] = 10.  # initial conditions, 1st solution
    x[0, 1, 0] = 5.

    x[0, 0, 1] = 0.   # initial conditions, 2nd solution
    x[0, 1, 1] = 0.0000000001

    def f(x):
        return np.array([x[1],
                         -x[0] - a * x[1] * (x[0] ** 2 + x[1] ** 2 - 1)])

    for step in range(num_steps):
        x[step + 1] = x[step] + h * f(x[step])

    return x

The question: forward_euler_vector() works, but was this the best way to vectorize it? I am asking because the vectorized version runs about 20 ms slower on my laptop:

In [27]: %timeit forward_euler()
1 loops, best of 3: 301 ms per loop

In [65]: %timeit forward_euler_vector()
1 loops, best of 3: 320 ms per loop

Solution 2

@Ophion's comment explains very well what's going on: the call to array() inside f(x) introduces overhead that kills the benefit of vectorizing the arithmetic in the expression h * f(x[step]).

And as he says, you may be interested in having a look at scipy.integrate for a nice set of numerical integrators.
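As a minimal sketch of that suggestion, the same system can be handed to scipy.integrate.solve_ivp; the function name rhs, the time span, and the tolerances below are illustrative choices, not part of the original answer:

```python
import numpy as np
from scipy.integrate import solve_ivp

a = 1.0

def rhs(t, u):
    # u = [x, y]; the same right-hand side as the hand-rolled Euler loop
    x, y = u
    return [y, -x - a * y * (x ** 2 + y ** 2 - 1)]

# Integrate the first initial condition, (x, y) = (10, 5), over t in [0, 100]
sol = solve_ivp(rhs, (0.0, 100.0), [10.0, 5.0], rtol=1e-6, atol=1e-9)

# This system has an attracting limit cycle on the unit circle, so the
# final radius should be close to 1
print(np.hypot(sol.y[0, -1], sol.y[1, -1]))
```

Here the adaptive RK45 integrator picks its own step sizes, so no step count or step width needs to be chosen by hand.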

To vectorize your code for the problem at hand, you want to avoid recreating the array every time you call f. You would like to initialize the array once and return it, modified, on every call. This is similar to a static variable in C/C++.

You can achieve this with a mutable default argument, which is evaluated once, at the time the function f(x) is defined, and which has local scope. Since it has to be mutable, you encapsulate it in a single-element list:

 def f(x, static_tmp=[np.empty((2, 2))]):
     static_tmp[0][0] = x[1]
     static_tmp[0][1] = -x[0] - a * x[1] * (x[0] ** 2 + x[1] ** 2 - 1)
     return static_tmp[0]
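To see why this works, note that a mutable default is built once, at definition time, and then shared across calls. This tiny standalone example (not from the original answer) shows the persistence:

```python
def bump(store=[0]):   # the list is created once, when the def executes
    store[0] += 1      # every call mutates that same shared list
    return store[0]

print(bump(), bump(), bump())  # -> 1 2 3: state survives between calls
```

In the Euler code the same mechanism means static_tmp[0] refers to one and the same array on every call to f, so no allocation happens inside the time-stepping loop.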

With this modification to your code, the overhead of array creation disappears, and on my machine I gain a small improvement:

%timeit forward_euler()        #258ms
%timeit forward_euler_vector() #248ms

This means that the gain from numpy's vectorized arithmetic is quite small, at least for the problem at hand.

You may also want to get rid of the function f altogether, doing its operations inside the for loop to avoid the call overhead. The default-argument trick can, however, also be applied with scipy's more general time integrators, where you must provide a function f.

EDIT: as pointed out by Jaime, another way to go is to treat static_tmp as an attribute of the function f, creating it after the function is declared but before it is called:

 def f(x):
     f.static_tmp[0] = x[1]
     f.static_tmp[1] = -x[0] - a * x[1] * (x[0] ** 2 + x[1] ** 2 - 1)
     return f.static_tmp
 f.static_tmp = np.empty((2, 2))

OTHER TIPS

There is always the trivial autojit solution:

import numpy as np

def forward_euler(initial_x, initial_y, num_steps, h):
    x = np.zeros([num_steps + 1, 2])  # steps, number of solutions
    y = np.zeros([num_steps + 1, 2])
    a = 1.

    x[0, 0] = initial_x[0]  # initial condition, 1st solution
    y[0, 0] = initial_y[0]

    x[0, 1] = initial_x[1]  # initial condition, 2nd solution
    y[0, 1] = initial_y[1]

    for step in range(int(num_steps)):
        x[step + 1] = x[step] + h * y[step]
        y[step + 1] = y[step] + h * (-x[step] - a * y[step] * (x[step] ** 2 + y[step] ** 2 - 1))

    return x, y

Timings:

from numba import autojit
jit_forward_euler = autojit(forward_euler)

%timeit forward_euler([10,0], [5,0.0000000001], 1E4, 0.01)
1 loops, best of 3: 385 ms per loop

%timeit jit_forward_euler([10,0], [5,0.0000000001], 1E4, 0.01)
100 loops, best of 3: 3.51 ms per loop
Licensed under: CC-BY-SA with attribution