Question

I am using nested scipy.integrate.quad calls to integrate a two-dimensional integrand. The integrand is built from numpy functions, so it is much more efficient to pass it an array of inputs than to loop over the inputs and call it once per value: roughly two orders of magnitude faster, thanks to numpy's arrays.

However, if I want to integrate my integrand over only one dimension, but with an array of inputs over the other dimension, things fall down: it seems the scipy quadpack routines can't handle whatever it is that numpy does with array inputs. Has anyone else seen this and/or found a way around it, or am I misunderstanding something? The error I get from quad is:

Traceback (most recent call last):
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 159, in <module>
    fnIntegrate_x(0, 1, NCALLS_SET, True)
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 35, in fnIntegrate_x
    I = Integrate_x(yarray)
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 23, in Integrate_x
    return quad(Integrand, 0, np.pi/2, args=(y))[0]
  File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 247, in quad
    retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points)
  File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 312, in _quad
    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
quadpack.error: Supplied function does not return a valid float.

I have put a cartoon version of what I'm trying to do below; what I'm actually doing has a more complicated integrand, but this is the gist.

The meat is at the top; the bottom does the benchmarking that illustrates the point.

import numpy as np
import time

from scipy.integrate import quad


def Integrand(x, y):
    '''
        Integrand
    '''
    return np.sin(x)*np.sin(y)

def Integrate_x(y):
    '''
        Integrate over x given (y)
    '''
    return quad(Integrand, 0, np.pi/2, args=(y))[0]



def fnIntegrate_x(ystart, yend, nsteps, ArrayInput = False):
    '''
        Integrate over x for nsteps values of y between ystart and yend.
        If ArrayInput is True, pass the whole y array to Integrate_x in
        one call; otherwise loop and call it once per y value.
    '''

    yarray = np.arange(ystart, yend, (yend - ystart)/float(nsteps))
    I = np.zeros(nsteps)
    if ArrayInput :
        I = Integrate_x(yarray)
    else :
        for i, y in enumerate(yarray) :
            I[i] = Integrate_x(y)

    return yarray, I




NCALLS_SET = 1000
NSETS = 10

SETS_t = np.zeros(NSETS)

for i in np.arange(NSETS) :

    XInputs = np.random.rand(NCALLS_SET, 2)

    t0 = time.time()
    for x in XInputs :
        Integrand(x[0], x[1])
    t1 = time.time()
    SETS_t[i] = (t1 - t0)/NCALLS_SET

print "Benchmarking Integrand - Single Values:"
print "NCALLS_SET: ", NCALLS_SET
print "NSETS: ", NSETS    
print "TimePerCall(s): ", np.mean(SETS_t) , np.std(SETS_t)/ np.sqrt(SETS_t.size)        
print "TotalTime: ",np.sum(SETS_t) * NCALLS_SET
'''
Benchmarking Integrand - Single Values:
NCALLS_SET:  1000
NSETS:  10
TimePerCall(s):  1.23999834061e-05 4.06987868647e-06
'''








NCALLS_SET = 1000
NSETS = 10

SETS_t = np.zeros(NSETS)

for i in np.arange(NSETS) :

    XInputs = np.random.rand(NCALLS_SET, 2)

    t0 = time.time()
    Integrand(XInputs[:,0], XInputs[:,1])
    t1 = time.time()
    SETS_t[i] = (t1 - t0)/NCALLS_SET

print "Benchmarking Integrand - Array Values:"
print "NCALLS_SET: ", NCALLS_SET
print "NSETS: ", NSETS    
print "TimePerCall(s): ", np.mean(SETS_t) , np.std(SETS_t)/ np.sqrt(SETS_t.size)        
print "TotalTime: ",np.sum(SETS_t) * NCALLS_SET
'''
Benchmarking Integrand - Array Values:
NCALLS_SET:  1000
NSETS:  10
TimePerCall(s):  2.00009346008e-07 1.26497018465e-07
'''












NCALLS_SET = 1000
NSETS = 100

SETS_t = np.zeros(NSETS)

for i in np.arange(NSETS) :


    t0 = time.time()
    fnIntegrate_x(0, 1, NCALLS_SET, False)
    t1 = time.time()
    SETS_t[i] = (t1 - t0)/NCALLS_SET

print "Benchmarking fnIntegrate_x - Single Values:"
print "NCALLS_SET: ", NCALLS_SET
print "NSETS: ", NSETS    
print "TimePerCall(s): ", np.mean(SETS_t) , np.std(SETS_t)/ np.sqrt(SETS_t.size)        
print "TotalTime: ",np.sum(SETS_t) * NCALLS_SET
'''
NCALLS_SET:  1000
NSETS:  100
TimePerCall(s):  0.000165750000477 8.61204306241e-07
TotalTime:  16.5750000477
'''








NCALLS_SET = 1000
NSETS = 100

SETS_t = np.zeros(NSETS)

for i in np.arange(NSETS) :


    t0 = time.time()
    fnIntegrate_x(0, 1, NCALLS_SET, True)
    t1 = time.time()
    SETS_t[i] = (t1 - t0)/NCALLS_SET

print "Benchmarking fnIntegrate_x - Array Values:"
print "NCALLS_SET: ", NCALLS_SET
print "NSETS: ", NSETS    
print "TimePerCall(s): ", np.mean(SETS_t) , np.std(SETS_t)/ np.sqrt(SETS_t.size)        

'''
****  Doesn't  work!!!! *****
Traceback (most recent call last):
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 159, in <module>
    fnIntegrate_x(0, 1, NCALLS_SET, True)
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 35, in fnIntegrate_x
    I = Integrate_x(yarray)
  File "C:\Users\JP\Documents\Python\TestingQuad\TestingQuad_v2.py", line 23, in Integrate_x
    return quad(Integrand, 0, np.pi/2, args=(y))[0]
  File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 247, in quad
    retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points)
  File "C:\Python27\lib\site-packages\scipy\integrate\quadpack.py", line 312, in _quad
    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
quadpack.error: Supplied function does not return a valid float.

'''

Solution

It is possible with the numpy.vectorize function. I had this problem for a long time before coming across vectorize.

You can use it like this:

vectorized_function = numpy.vectorize(your_function)

output = vectorized_function(your_array_input)
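
For the code in the question, that would look something like the following (a minimal sketch reusing the question's Integrand and Integrate_x; note that numpy.vectorize is essentially a Python-level loop, so it makes the array call work but does not by itself give the array-input speed-up measured in the benchmarks):

import numpy as np
from scipy.integrate import quad

def Integrand(x, y):
    return np.sin(x) * np.sin(y)

def Integrate_x(y):
    # quad only accepts a scalar y here
    return quad(Integrand, 0, np.pi/2, args=(y,))[0]

# Wrap the scalar-only Integrate_x so it accepts an array of y values
Integrate_x_vec = np.vectorize(Integrate_x)

yarray = np.arange(0, 1, 1/1000.)
I = Integrate_x_vec(yarray)   # one quad call per element of yarray, under the hood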

Other tips

Afraid I'm answering my own question with a negative here: I don't think it is possible. quad is a wrapper around a library written in another language (the Fortran QUADPACK routines), so it is that underlying library that defines how the integrand is called, and it is probably not possible to do what I wanted without redesigning the library itself.

For other people with timing issues in multi-dimensional integration: I found the best way was to use a dedicated integration library. The 'Cuba' library seemed to have some pretty efficient multi-dimensional integration routines.

http://www.feynarts.de/cuba/

These routines are written in C, so I ended up using SWIG to talk to them, and eventually, for efficiency, rewrote my integrand in C as well, which sped things up a lot.

Use quadpy (a project of mine). It is fully vectorized, so it can handle array-valued functions of any shape, and it does so very fast.
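
A minimal sketch of what that looks like, assuming quadpy's scipy-style quad interface (val, err = quadpy.quad(f, a, b)); the exact API may differ between quadpy versions, so check its documentation. The key difference from scipy's quad is that the integrand receives all evaluation points at once as an array, so the numpy-vectorized Integrand from the question can be used directly, and, per the answer above, array-valued integrands (e.g. one component per y value) are supported as well:

import numpy as np
import quadpy

y = 0.5   # a single y value, for illustration

def integrand(x):
    # x arrives as an array of quadrature points, not one scalar at a time
    return np.sin(x) * np.sin(y)

val, err = quadpy.quad(integrand, 0.0, np.pi/2)
print(val)   # ~ sin(0.5) = 0.479..., since the integral of sin(x) over [0, pi/2] is 1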

I was having this issue with integrating probability density functions from -np.inf to np.inf over all dimensions.

I fixed it by creating a wrapper function taking in *args, converting args to a numpy array, and integrating the wrapper function.

I think using numpy's vectorize only integrates the subspace where all values are equal.

Here's an example:

import numpy as np
from scipy.integrate import nquad
from scipy.stats import multivariate_normal

mean = [0., 0.]
cov = np.array([[1., 0.],
                [0., 1.]])

bivariate_normal = multivariate_normal(mean=mean, cov=cov)

def pdf(*args):
    x = np.array(args)
    return bivariate_normal.pdf(x)

integration_range = [[-18, 18], [-18, 18]]

nquad(pdf, integration_range)

Output: (1.000000000000001, 1.3429066352690133e-08)
Licensed under: CC-BY-SA with attribution