Question

I have an array of cartesian points (column 1 is x values and column 2 is y values) like so:

308 522
307 523
307 523
307 523
307 523
307 523
306 523

How would I go about getting a standard deviation of the points? It would be compared to the mean, which would be a straight line. The points are not that straight line, so then the standard deviation describes how wavy or "off-base" from the straight line the line segment is.

I really appreciate the help.


Solution

If you are certain the xy data describe a straight line, you'd do the following.

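Finding the best-fitting straight line amounts to solving the over-determined linear system Ax = b in the least-squares sense, where the unknown x holds the slope and intercept of the line, and A and b are built from the data: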

xy = [
308 522
307 523
307 523
307 523
307 523
307 523
306 523];

x_vals = xy(:,1);
y_vals = xy(:,2);

A = [x_vals ones(size(x_vals))];
b = y_vals;

In MATLAB, the backslash operator solves this system in the least-squares sense:

sol = A\b;

m = sol(1);
c = sol(2);

What we've done now is find the values of m and c so that the line y = mx+c best fits the data you've given. This best-fit line does not pass through every point, so it leaves errors (vertical residuals) with respect to the y-data:

errs = (m*x_vals + c) - y_vals;

The standard deviation of these errors can be computed like so:

>> std(errs)
ans = 
    0.2440
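
As a cross-check, the same fit can be sketched with the built-in polyfit and polyval instead of the backslash operator. Centering the x values before fitting is an assumption added here, not part of the original answer; it only shifts the intercept and keeps the problem well conditioned when the x values are large and close together. The variable names x_mean, p and s are illustrative.

```matlab
% Same sample data as above
xy = [308 522; 307 523; 307 523; 307 523; 307 523; 307 523; 306 523];
x_vals = xy(:,1);
y_vals = xy(:,2);

% Center x before fitting (assumption: done for conditioning only;
% the slope and the residuals are unaffected by the shift)
x_mean = mean(x_vals);
p = polyfit(x_vals - x_mean, y_vals, 1);   % p(1) = slope, p(2) = intercept at x_mean

% Vertical residuals of the fitted line and their sample standard deviation
errs = polyval(p, x_vals - x_mean) - y_vals;
s = std(errs);
```

With the data above, p(1) should match the slope m from the backslash solution, and s should reproduce the 0.2440 value.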

If you want to use the perpendicular distance to the line (Euclidean distance), you'll have to include a geometric factor:

errs = (m*x_vals + c) - y_vals;
errs_perpendicular = errs * cos(atan(m));

Using the identity cos(atan(m)) = 1/sqrt(1+m^2), this can be reworked to

errs_perpendicular = errs * 1/sqrt(1+m*m);

and of course,

>> std(errs_perpendicular)
ans = 
    0.2182
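
To double-check the geometric factor, the perpendicular distances can also be computed directly from the point-to-line distance formula for the line m*x - y + c = 0. This is only a sketch for verification; the names d_perp, d_trig, max_diff and s_perp are introduced here for illustration.

```matlab
% Same sample data and least-squares fit as above
xy = [308 522; 307 523; 307 523; 307 523; 307 523; 307 523; 306 523];
x_vals = xy(:,1);
y_vals = xy(:,2);
sol = [x_vals ones(size(x_vals))] \ y_vals;
m = sol(1);
c = sol(2);

% Signed perpendicular distance of each point to the line m*x - y + c = 0
d_perp = (m*x_vals - y_vals + c) / sqrt(1 + m^2);

% The same quantity via the trigonometric form
d_trig = ((m*x_vals + c) - y_vals) * cos(atan(m));

% Both forms should agree to rounding error
max_diff = max(abs(d_perp - d_trig));
s_perp = std(d_perp);
```

Note that the line itself is still the one that minimizes the vertical errors; only the measurement of the deviations changes.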

If you are not certain that a straight line fits through the data and/or your xy data essentially describe a point cloud around some common centre, you'd do the following.

Find the center of mass (COM):

COM = mean(xy);

the distances of all points to the COM:

dists = sqrt(sum(bsxfun(@minus, COM, xy).^2,2));

and the standard deviation thereof:

>> std(dists)
ans =  
    0.5059
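
The distance computation above can also be written without bsxfun, which some readers may find easier to follow. The following is a minimal sketch using hypot on the per-column offsets; dx, dy and s are illustrative names.

```matlab
% Same sample data as above
xy = [308 522; 307 523; 307 523; 307 523; 307 523; 307 523; 306 523];

% Center of mass: column-wise mean, a 1-by-2 row vector
COM = mean(xy);

% Offsets of every point from the COM, per coordinate
dx = xy(:,1) - COM(1);
dy = xy(:,2) - COM(2);

% Euclidean distance of every point to the COM
dists = hypot(dx, dy);

% Sample standard deviation of those distances
s = std(dists);
```

This should give the same 0.5059 as the bsxfun version, since it computes the same distances.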

Other suggestions

The mean of a set of two-dimensional values is another two-dimensional value, i.e. it's a point, not a line. This point is also known as the centre of mass, I believe.

It's not entirely clear what standard deviation is in this case, but I think it would make sense to define it in terms of distance from the mean.
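
One possible way to define it in terms of distance from the mean is the root-mean-square distance of the points to their mean, normalized by n-1 in the same way std is. This is an assumption about what the definition could look like, not an established answer to the question, and it differs from taking std of the distances; the names centre, sq_dists, s_dist and s_check are illustrative.

```matlab
% Sample data from the question
xy = [308 522; 307 523; 307 523; 307 523; 307 523; 307 523; 306 523];
n = size(xy, 1);

% Mean point of the data
centre = mean(xy);

% Squared Euclidean distance of every point to the mean
sq_dists = (xy(:,1) - centre(1)).^2 + (xy(:,2) - centre(2)).^2;

% Root-mean-square distance with n-1 normalization (assumed definition)
s_dist = sqrt(sum(sq_dists) / (n - 1));

% Equivalent form: square root of the summed per-column variances
s_check = sqrt(sum(var(xy)));
```

Both expressions should agree, because the squared distance to the mean is the sum of the squared deviations in x and in y.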

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow