solve linear equations given variables and uncertainties: scipy-optimize?

Question 1

This is my approach. Assuming x1-x4 are approximately normally distributed around their means (with the quoted 1-sigma uncertainties), the problem becomes one of minimizing the sum of squared errors, each scaled by its uncertainty, subject to 3 linear equality constraints. Therefore, we can attack it using scipy.optimize.fmin_slsqp():

In [19]:

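import numpy as np
import scipy.optimize as so

# Equality constraints: each function returns 0 when its equation holds.
def eq_f1(x):
    return (x*np.array([0.5, 1.0, 1.5, 2.0])).sum()-8
def eq_f2(x):
    return (x*np.array([0.0, 0.0, 1.0, 1.0])).sum()-4
def eq_f3(x):
    return (x*np.array([1.0, 1.0, 0.0, 0.0])).sum()-1
# Objective: sum of squared deviations from the measured means, in units of sigma.
def error_f(x):
    error=(x-np.array([0.246, 0.749, 1.738, 2.248]))/np.array([0.007, 0.010, 0.009, 0.007])
    return (error*error).sum()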
In [20]:

so.fmin_slsqp(error_f, np.array([0.246, 0.749, 1.738, 2.248]), eqcons=[eq_f1, eq_f2, eq_f3])
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 2.17576389592
            Iterations: 4
            Function evaluations: 32
            Gradient evaluations: 4
Out[20]:
array([ 0.25056582,  0.74943418,  1.74943418,  2.25056582])
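
As a cross-check on the SLSQP result, the three equality constraints can be eliminated by hand. This is a minimal sketch in plain Python, not part of the approach above: substituting the constraints leaves one free parameter a, with x = (a, 1 - a, 2 - a, 2 + a), and the minimizer of the weighted sum of squares is then an inverse-variance weighted average. The names `feasible_point`, `targets`, `weights` and `a_hat` are illustrative.

```python
# Measured means and 1-sigma uncertainties from the question.
means = [0.246, 0.749, 1.738, 2.248]
sigmas = [0.007, 0.010, 0.009, 0.007]

# With x1 + x2 = 1, x3 + x4 = 4 and 0.5*x1 + x2 + 1.5*x3 + 2*x4 = 8,
# every feasible point has the form x = (a, 1 - a, 2 - a, 2 + a).
def feasible_point(a):
    return [a, 1.0 - a, 2.0 - a, 2.0 + a]

# Each measurement implies its own estimate of a.
targets = [means[0], 1.0 - means[1], 2.0 - means[2], means[3] - 2.0]
weights = [1.0 / s**2 for s in sigmas]

# Minimizing sum(w_i * (a - t_i)**2) gives the weighted average of the t_i.
a_hat = sum(w * t for w, t in zip(weights, targets)) / sum(weights)
x_hat = feasible_point(a_hat)
chi2 = sum(((x - m) / s) ** 2 for x, m, s in zip(x_hat, means, sigmas))

print(x_hat)
print(chi2)
```

Under these assumptions a_hat comes out at about 0.250565, i.e. x is roughly (0.2506, 0.7494, 1.7494, 2.2506) with an objective value of about 2.176, which agrees with the fmin_slsqp output to within the optimizer's tolerance.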

Question 2

It appears to me that I have a very similar problem. I am relatively new to Python and have used it mostly to sort and reduce data with pandas.

I have a set of linear equations for which I want to find the best-fit parameters. However, the dataset has known uncertainties that need to be considered (given in parentheses).

x1*99(1)+x2*45(1)=52(0.2)
x1*1(0.5)+x2*16(1)=15(0.1)

Moreover there are constraints:

x1>=0
x2>=0
x1+x2=1

My approach would be to treat the equations as constraints and minimize the sum of the squared residuals, as shown in the example above.

Solving this without uncertainties is not the issue. I am looking for a hint on how to account for the uncertainties while finding the best-fit parameters.

Question 3

As given, the problem has no exact solution. This is because if the inputs x1, x2, x3 and x4 are Gaussian, then the outputs:

y1 = 0.5 * x1 + 1.0 * x2 + 1.5 * x3 + 2.0 * x4 - 8.0
y2 = 0.0 * x1 + 0.0 * x2 + 1.0 * x3 + 1.0 * x4 - 4.0
y3 = 1.0 * x1 + 1.0 * x2 + 0.0 * x3 + 0.0 * x4 - 1.0

are also Gaussian. Assuming that x1, x2, x3 and x4 are independent random variables, this is easy to see with OpenTURNS:

import openturns as ot
x1 = ot.Normal(0.246, 0.007)
x2 = ot.Normal(0.749, 0.010)
x3 = ot.Normal(1.738, 0.009)
x4 = ot.Normal(2.248, 0.007)
y1 = 0.5 * x1 + 1.0 * x2 + 1.5 * x3 + 2.0 * x4 - 8.0
y2 = 0.0 * x1 + 0.0 * x2 + 1.0 * x3 + 1.0 * x4 - 4.0
y3 = 1.0 * x1 + 1.0 * x2 + 0.0 * x3 + 0.0 * x4 - 1.0

The following script produces the graph:

graph1 = y1.drawPDF()
graph1.setLegends(["y1"])
graph2 = y2.drawPDF()
graph2.setLegends(["y2"])
graph3 = y3.drawPDF()
graph3.setLegends(["y3"])
graph1.add(graph2)
graph1.add(graph3)
graph1.setColors(["dodgerblue3",
                   "darkorange1", 
                   "forestgreen"])
graph1.setXTitle("Y")

The previous script produces the following output.

[Figure: PDFs of y1, y2 and y3, plotted against Y]

Given where 0.0 falls within these distributions, I would say that solving the equations exactly is mathematically impossible, but that zero residuals are physically consistent with the data.
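
To make the position of 0.0 concrete without the plot: for independent Gaussian inputs, a linear combination sum(a_i * x_i) - c is Gaussian with mean sum(a_i * mu_i) - c and variance sum(a_i**2 * sigma_i**2). The plain-Python sketch below applies this to y1, y2 and y3 and reports how many standard deviations separate each mean from zero. The helper name `propagate` and the `equations` dictionary are illustrative, not part of OpenTURNS.

```python
import math

# Means and 1-sigma uncertainties of x1..x4 from the question.
mu = [0.246, 0.749, 1.738, 2.248]
sigma = [0.007, 0.010, 0.009, 0.007]

# (coefficients, constant) for y = sum(a_i * x_i) - c
equations = {
    "y1": ([0.5, 1.0, 1.5, 2.0], 8.0),
    "y2": ([0.0, 0.0, 1.0, 1.0], 4.0),
    "y3": ([1.0, 1.0, 0.0, 0.0], 1.0),
}

def propagate(coeffs, const):
    """Mean and standard deviation of sum(a_i * x_i) - const for independent Gaussian x_i."""
    mean = sum(a * m for a, m in zip(coeffs, mu)) - const
    std = math.sqrt(sum((a * s) ** 2 for a, s in zip(coeffs, sigma)))
    return mean, std

results = {name: propagate(a, c) for name, (a, c) in equations.items()}
for name, (mean, std) in results.items():
    print(f"{name}: mean={mean:.4f} std={std:.4f} |mean|/std={abs(mean) / std:.2f}")
```

With the numbers from the question, the means are about -0.025, -0.014 and -0.005 with standard deviations of about 0.022, 0.011 and 0.012, so zero lies within roughly 1.2 standard deviations of each mean.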

Actually, I guess that the Gaussian distributions you gave for x1, ..., x4 are estimated from data. So I would rather reformulate the problem as follows:

Given a sample of observed values of x1, x2, x3, x4, what are the values of e1, e2, e3 such that:

y1 = 0.5 * x1 + 1.0 * x2 + 1.5 * x3 + 2.0 * x4 - 8 + e1 = 0
y2 = 0.0 * x1 + 0.0 * x2 + 1.0 * x3 + 1.0 * x4 - 4 + e2 = 0
y3 = 1.0 * x1 + 1.0 * x2 + 0.0 * x3 + 0.0 * x4 - 1 + e3 = 0

This turns the problem into an inverse problem, which can be solved by calibrating e1, e2, e3. Furthermore, given the finite sample size of x1, ..., x4, we might want to produce the distribution of e1, e2, e3. This can be done by bootstrapping the input/output pairs (x, y): the distribution of e1, e2, e3 then reflects how much these parameters vary with the sample at hand.

First, we have to generate a sample from the distribution (I assume that you have such a sample but have not published it so far):

distribution = ot.ComposedDistribution([x1, x2, x3, x4])
sampleSize = 10
xobs = distribution.getSample(sampleSize)

Then we define the model:

formulas = [
    "y1 := 0.5 * x1 + 1.0 * x2 + 1.5 * x3 + 2.0 * x4 + e1 - 8.0",
    "y2 := 0.0 * x1 + 0.0 * x2 + 1.0 * x3 + 1.0 * x4 + e2 - 4.0",
    "y3 := 1.0 * x1 + 1.0 * x2 + 0.0 * x3 + 0.0 * x4 + e3 - 1.0"
]
program = ";".join(formulas)
g = ot.SymbolicFunction(["x1", "x2", "x3", "x4", "e1", "e2", "e3"],
                        ["y1", "y2", "y3"], 
                        program)

We then set the observed outputs, which are a sample of zeros:

yobs = ot.Sample(sampleSize, 3)

We start with initial values equal to zero, and define the function to calibrate:

e1Initial = 0.0
e2Initial = 0.0
e3Initial = 0.0
thetaPrior = ot.Point([e1Initial,e2Initial,e3Initial])
calibratedIndices = [4, 5, 6]
mycf = ot.ParametricFunction(g, calibratedIndices, thetaPrior)

Then we can calibrate the model:

algo = ot.NonLinearLeastSquaresCalibration(mycf, xobs, yobs, thetaPrior)
algo.run()
calibrationResult = algo.getResult()
print(calibrationResult.getParameterMAP())

This prints:

[0.0265988,0.0153057,0.00495758]

This means that the errors e1, e2, e3 are rather small. We can compute a confidence interval:

thetaPosterior = calibrationResult.getParameterPosterior()
print(thetaPosterior.computeBilateralConfidenceIntervalWithMarginalProbability(0.95)[0])

This prints:

[0.0110046, 0.0404756]
[0.00921992, 0.0210059]
[-0.00601084, 0.0156665]

The interval for the third parameter e3 contains zero, so e3 might be zero; the intervals for e1 and e2 do not. Finally, we can get the distribution of the errors:

thetaPosterior = calibrationResult.getParameterPosterior()

and draw it:

graph1 = thetaPosterior.getMarginal(0).drawPDF()
graph2 = thetaPosterior.getMarginal(1).drawPDF()
graph3 = thetaPosterior.getMarginal(2).drawPDF()
graph1.add(graph2)
graph1.add(graph3)
graph1.setColors(["dodgerblue3",
                  "darkorange1", 
                  "forestgreen"])
graph1

This produces:

[Figure: posterior PDFs of e1, e2 and e3]

This shows that e3 might be zero given the variability in the observed inputs x1, ..., x4, whereas e1 and e2 differ from zero at the 95% level. The conclusion for this sample is that the third equation is approximately solved by the observed values of x1, ..., x4, but the first two equations are not.
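
The calibration and its bootstrap can be cross-checked without OpenTURNS. The sketch below is a hand-rolled percentile bootstrap in plain Python, not the library's implementation. It relies on the fact that each e_k enters its equation additively, so the least-squares estimate of e_k is minus the mean residual of that equation over the sample. The observations are synthetic draws with a fixed seed, so the numbers will differ from those printed above; the helper name `estimate_offsets` and the choice of 2000 resamples are assumptions made for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

mu = [0.246, 0.749, 1.738, 2.248]
sigma = [0.007, 0.010, 0.009, 0.007]
coeffs = [[0.5, 1.0, 1.5, 2.0], [0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
consts = [8.0, 4.0, 1.0]

# Synthetic "observations": 10 independent draws of (x1, x2, x3, x4).
sample = [[random.gauss(m, s) for m, s in zip(mu, sigma)] for _ in range(10)]

def estimate_offsets(rows):
    """Least-squares e_k for sum(a_kj * x_j) - c_k + e_k = 0: minus the mean residual."""
    offsets = []
    for row_a, c in zip(coeffs, consts):
        residuals = [sum(a * x for a, x in zip(row_a, row)) - c for row in rows]
        offsets.append(-statistics.fmean(residuals))
    return offsets

e_hat = estimate_offsets(sample)

# Bootstrap: resample the observations with replacement and re-estimate.
boot = [estimate_offsets(random.choices(sample, k=len(sample))) for _ in range(2000)]

# Percentile intervals at the 95% level for e1, e2, e3.
intervals = []
for k in range(3):
    values = sorted(b[k] for b in boot)
    low = values[int(0.025 * len(values))]
    high = values[int(0.975 * len(values)) - 1]
    intervals.append((low, high))

print(e_hat)
print(intervals)
```

Because the inputs are drawn around the quoted means, the estimates should land near 0.025, 0.014 and 0.005 (the negatives of the mean residuals), which is the same order of magnitude as the calibrated values above. Whether a given interval contains zero depends on the particular sample, which is the point made in the conclusion.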