numpy: evaluating function in matrix, using previous array as argument in calculating the next

Question 1

This sort of thing is very difficult to do in numpy. If we look at this by column we see a few simpler solutions.

a[:,0] is very easy:

col0 = np.ones((1000))*2
col0[0] = 1                  #Or whatever start value.
np.cumprod(col0, out=col0)

np.allclose(col0, a[:1000,0])
True

As mentioned earlier this will overflow very quickly. a[:,1] can be done much along the same lines.

I do not believe there is a way to do the next two columns inside numpy alone quickly. We can turn to numba for this:

from numba import auotojit

def python_loop(start, count):
     out = np.zeros((count), dtype=np.double)
     out[0] = start
     for x in xrange(count-1):
         out[x+1] = out[x] + np.cos(out[x+1])
     return out

numba_loop = autojit(python_loop)

np.allclose(numba_loop(3,1000),a[:1000,2])
True

%timeit python_loop(3,1000000)
1 loops, best of 3: 4.14 s per loop

%timeit numba_loop(3,1000000)
1 loops, best of 3: 42.5 ms per loop

Although its worth pointing out that this converges to pi/2 very very quickly and there is little point in calculating this recursion past ~20 values for any start value. This returns the exact same answer to double point precision- I didn't bother finding the cutoff, but it is much less then 50:

%timeit tmp = np.empty((1000000)); 
        tmp[:50] = numba_loop(3,50);
        tmp[50:] = np.pi/2
100 loops, best of 3: 2.25 ms per loop

You can do something similar with the fourth column. Of course you can autojit all of the functions, but this gives you several different options to try out depending on numba usage:

Use cumprod for the first two columns
Use an approximation for column 3 (and possible 4) where only the first few iterations are calculated
Implement columns 3 and 4 in numba using autojit
Wrap everything inside of an autojit loop (the best option)
The way you have presented this all rows past ~200 will either be np.inf or np.pi/2. Exploit this.

Question 2

Slightly faster. Your first column is basicly 2^n. Calculating 2^n for n up to 1000000 is gonna overflow.. second column is even worse.

def calc(arr, t0=1E-6):
    u = arr[0]
    dt = 1E-6
    h = lambda x: np.random.random(1)*50.0

    def firstColGen(uStart):
        u = uStart
        while True:
            u += u
            yield u

    def secondColGen(uStart, A):
        u = uStart
        while True:
            u += u*A
            yield u

    def thirdColGen(uStart):
        u = uStart
        while True:
            u += np.cos(u)
            yield u

    def fourthColGen(uStart, h, t0, dt):
        u = uStart
        t = t0
        while True:
            u += h(u) * dt
            t += dt
            yield u

    first = firstColGen(u[0])
    second = secondColGen(u[1], A)
    third = thirdColGen(u[2])
    fourth = fourthColGen(u[3], h, t0, dt)

    for i in xrange(1, len(arr)):
        arr[i] = [first.next(), second.next(), third.next(), fourth.next()]