Fitting multiple distributions

https://stackoverflow.com/questions/12577179

03-07-2021

Question

I have a situation where two or more nd arrays, with some coefficients, should add up (roughly) to a third array.

array1*c1 + array2*c2 + ... = array3

I'm looking for the c1 and c2 that make the first two arrays best approximate array3. I'm sure some way of doing this exists in scipy, but I'm not sure where to start. Is there a specific module I should begin with?

Solution

numpy.linalg.lstsq solves this for you. Object-oriented wrappers for that function, as well as more advanced regression models, are available in both scikit-learn and StatsModels.

(Disclaimer: I'm a scikit-learn developer, so this is not the most unbiased advice ever.)
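As a minimal sketch of the lstsq approach, assuming the n-d arrays all share one shape and can be flattened into columns of a 2-D matrix: the helper name fit_coefficients and the synthetic data are made up for illustration and are not part of the original answer.

```python
import numpy as np

def fit_coefficients(arrays, target):
    """Return least-squares coefficients c so that sum(c[i] * arrays[i]) ~ target.

    Hypothetical helper for illustration; assumes every array has the
    same shape as target.
    """
    # Flatten each n-d array into one column of the design matrix A.
    A = np.column_stack([np.asarray(arr, dtype=float).ravel() for arr in arrays])
    b = np.asarray(target, dtype=float).ravel()
    # lstsq returns (solution, residuals, rank, singular values).
    coeffs, _residuals, _rank, _sv = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

# Synthetic example: array3 is built from known coefficients 2.0 and -0.5.
rng = np.random.default_rng(0)
array1 = rng.normal(size=(4, 5))
array2 = rng.normal(size=(4, 5))
array3 = 2.0 * array1 - 0.5 * array2

c1, c2 = fit_coefficients([array1, array2], array3)
print(c1, c2)
```

With noisy data the recovered coefficients will only approximate the true ones; the residuals returned by lstsq give a sense of how good the fit is.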

OTHER TIPS

This is just linear regression (http://en.wikipedia.org/wiki/Ordinary_least_squares).

Let the matrix A have array1, array2, ... (each flattened to a vector) as its columns. Let the vector a be array3 (flattened the same way) and x be the column vector [c1,c2,...]'.

You want to solve the problem min_{x} ||Ax-a||^2, i.e. minimize the sum of squared differences between Ax and a.

Taking the derivative with respect to x and setting it to zero gives 0=A'Ax-A'a, which (assuming A'A is invertible, i.e. the columns of A are linearly independent) gives the solution x=(A'A)^{-1}A'a.

In numpy this is numpy.linalg.solve(numpy.dot(A.T,A),numpy.dot(A.T,a)).
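A short sketch of the normal-equations formula, checked against numpy.linalg.lstsq. The matrix A_demo, vector a_demo and the coefficients used to build them are synthetic and assumed only for illustration.

```python
import numpy as np

# Synthetic problem: 20 observations, 2 columns, known coefficients plus noise.
rng = np.random.default_rng(1)
A_demo = rng.normal(size=(20, 2))
a_demo = A_demo @ np.array([1.5, -3.0]) + 0.01 * rng.normal(size=20)

# Normal equations: solve (A'A) x = A'a without forming an explicit inverse.
x_normal = np.linalg.solve(A_demo.T @ A_demo, A_demo.T @ a_demo)

# Reference solution from the SVD-based least-squares routine.
x_lstsq = np.linalg.lstsq(A_demo, a_demo, rcond=None)[0]

print(x_normal, x_lstsq)
```

Both give the same answer on well-conditioned data. Forming A'A squares the condition number of the problem, so lstsq is generally the safer choice when the columns of A are nearly collinear.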

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow