Question

I'm trying to do a linear fit to some data in numpy.

For example (where w is the number of samples I have for each value, i.e. for the point at x=0 I only have 1 measurement, with a value of 2.2, but for the point at x=1 I have 2 measurements with a value of 3.5):

import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([2.2, 3.5, 4.6, 5.2])
w = np.array([1, 2, 2, 1])

z = np.polyfit(x, y, 1, w=w)

So, now the question is: is it correct to use w=w in polyfit for these cases, should I use w = sqrt(w), or what should I use?

Also, how can I get the fit error from polyfit?


Solution

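If your measurements are normally distributed, then the uncertainty in each value is proportional to 1/sqrt(n), where n is the number of measurements behind that value. np.polyfit expects weights of 1/sigma (not 1/sigma**2), because the weights multiply the residuals before they are squared. You therefore want to weight your fit by the inverse of the uncertainty, so your second guess is best: w=np.sqrt(n)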
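One way to check this, sketched here under the assumption that every repeated measurement at a given x returned the same value: a fit with w=np.sqrt(n) should give the same coefficients as an unweighted fit in which each point simply appears n times, since both minimize sum(n * residual**2). The names p_weighted and p_repeated are only illustrative.

```python
import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([2.2, 3.5, 4.6, 5.2])
n = np.array([1, 2, 2, 1])  # number of measurements behind each point

# Weighted fit: polyfit minimizes sum((w * (y - fit))**2), so w = sqrt(n)
p_weighted = np.polyfit(x, y, 1, w=np.sqrt(n))

# Unweighted fit with each point repeated n times
p_repeated = np.polyfit(np.repeat(x, n), np.repeat(y, n), 1)

print(p_weighted)
print(p_repeated)
print(np.allclose(p_weighted, p_repeated))
```

Passing w=n instead would weight each squared residual by n**2, which over-counts the repeated points.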

To get the covariance matrix of the fitted parameters, also pass cov=True.

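import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([2.2, 3.5, 4.6, 5.2])
n = np.array([1, 2, 2, 1])  # number of measurements per point

# p holds the coefficients (slope first), c is their covariance matrix
p, c = np.polyfit(x, y, 1, w=np.sqrt(n), cov=True)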

The diagonal elements of the covariance matrix are the variances of the individual parameters, and the off-diagonal elements are the covariances between them. So most likely what you want for "fit error" is the square root of the diagonal:

e = np.sqrt(np.diag(c))
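
A hedged follow-up sketch on what those errors mean: by default, cov=True rescales the covariance by chi2/dof, i.e. it treats the weights as meaningful only relative to each other. Recent NumPy versions also accept cov='unscaled', which skips that rescaling and is intended for the case where w really is 1/sigma with sigma known in absolute terms. That is not the situation here, since sqrt(n) is only proportional to 1/sigma, so the default is probably what you want. The comparison below just shows how the two are related; the variable names are illustrative.

```python
import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([2.2, 3.5, 4.6, 5.2])
n = np.array([1, 2, 2, 1])
w = np.sqrt(n)

# Default: covariance scaled so that the reduced chi-square equals 1
p, c_scaled = np.polyfit(x, y, 1, w=w, cov=True)
# Unscaled: weights taken at face value as 1/sigma
_, c_unscaled = np.polyfit(x, y, 1, w=w, cov="unscaled")

# Weighted chi-square and degrees of freedom (4 points - 2 parameters)
chi2 = np.sum((w * (y - np.polyval(p, x))) ** 2)
dof = len(x) - len(p)

slope_err, intercept_err = np.sqrt(np.diag(c_scaled))
print("slope     = %.3f +/- %.3f" % (p[0], slope_err))
print("intercept = %.3f +/- %.3f" % (p[1], intercept_err))

# The scaled covariance is the unscaled one multiplied by chi2/dof
print(np.allclose(c_scaled, c_unscaled * chi2 / dof))
```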
Licensed under: CC-BY-SA with attribution