Question

I have two large multidimensional arrays: Y carries three measurements of half a million objects (e.g. shape=(500000,3)) and X has same shape, but contains position of Y measurements.

At first, I would like for each row, containing an object, to fit a polynomial equation. I know that iterating over arrays are quite slow, but what I'm doing for the moment is:

fit = array([polyfit(X[i],Y[i],deg) for i in xrange(obs.shape[0])])

My question is: is there any possibility of fitting each row of both arrays without explicitly iterating over them?

Was it helpful?

Solution

It is possible to do so without iterate along the first axis. However, your second axis is rather short (being just 3), you can really fit no more than 2 coefficients.

In [67]:

import numpy as np
import scipy.optimize as so

In [68]:

def MD_ployError(p, x, y):
    '''if x has the shape of (n,m), y must be (n,m), p must be (n*p, ), where p is degree'''
    #d is no. of degree
    p_rshp=p.reshape((x.shape[0], -1))
    f=y*1.
    for i in range(p_rshp.shape[1]):
        f-=p_rshp[:,i][:,np.newaxis]*(x**i)
    return (f**2).sum()

In [69]:

X=np.random.random((100, 6))
Y=4+2*X+3*X*X
P=(np.zeros((100,3))+[1,1,1]).ravel()

In [70]:

MD_ployError(P, X, Y)

Out[70]:
11012.2067606684

In [71]:

R=so.fmin_slsqp(MD_ployError, P, args=(X, Y))
Iteration limit exceeded    (Exit mode 9) #you can increase iteration limit, but the result is already good enough.
            Current function value: 0.00243784856039
            Iterations: 101
            Function evaluations: 30590
            Gradient evaluations: 101

In [72]:

R.reshape((100, -1))

Out[72]:
array([[ 3.94488512,  2.25402422,  2.74773571],
       [ 4.00474864,  1.97966551,  3.02010015],
       [ 3.99919559,  2.0032741 ,  2.99753804],
..............................................)

OTHER TIPS

Yes, if you use the new numpy polyfit from np.polynomial, not the old np.polyfit:

X = np.arange(3)
Y = np.random.rand(10000, 3)

fit = np.array([np.polyfit(X, y, 2) for y in Y])
fits = np.polynomial.polynomial.polyfit(X, Y.T, 2)

assert np.allclose(fit.T[::-1], fits)

Timing:

In [692]: timeit fit = np.array([np.polyfit(X, y, 2) for y in Y])
 1 loops, best of 3: 2.22 s per loop

In [693]:  timeit fits = np.polynomial.polynomial.polyfit(X, Y.T, 2)
100 loops, best of 3: 3.63 ms per loop
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top