Question

I think I have a problem with Numpy and small numbers.

Can you help me finding a solution for the following problem:

import numpy as np

def gaussian(xx, mu=0, sigma=1):
    return 1./(np.sqrt(2*np.pi)*sigma)*np.exp(-(mu-xx)**2/(2*sigma**2))

factors = (1., 0.1, 0.01, 0.001, 0.0001)
for factor in factors:
    xx = np.linspace(5000, 5200, 1000)
    yy = 1.-(factor*xx*(1.+gaussian(xx, 5100)))
    step = xx[1] - xx[0]

    print np.sum((1.-yy/(1.-factor*xx))*step)

This code should evaluate to -1 for all of the different factors. But the output is :

1.0 -1.00019611689
0.1 -1.00196463662
0.01 -1.0200000008
0.001 -1.24390245353
0.0001 1.04081641153

So the problem is, that the smaller the factor is, the more I get into trouble because of what I think has to do with the precision of Numpy/Python.

What to do to evaluate the equation even for the small factors?

Thanks a lot for your help in advance.

Was it helpful?

Solution

This isn't a NumPy issue. Your expectation that the result should be -1 is incorrect.

You're effectively computing the definite integral of a function f(x) on a large interval around 5100. The function you're integrating simplifies to factor * x * gaussian(x) / (1 - factor * x). We can easily give a back-of-the-envelope estimate for the value of the integral for the factors you're using: the quantity factor * x / (1 - factor * x) will vary fairly slowly across the range of interest, which is about 5095 to 5105; for anything outside this range the gaussian will render the contribution negligible. So to a first approximation we can treat that quantity as constant. Then we're left with that constant times the integral of gaussian(x), which will be near enough 1. Thus the expected output should be somewhere in the region of factor * 5100 / (1 - factor * 5100). (Though that's not going to work so well when factor is close to 1 / 5100.)

So for example in the case that factor is 0.0001, factor * 5100 / (1 - factor * 5100) has value around 1.0408163265306123. This is close enough to the answer you're seeing to make it plausible that NumPy is doing more-or-less the right thing.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top