Question

What's the algorithm for computing a least squares plane in (x, y, z) space, given a set of 3D data points? In other words, if I had a bunch of points like (1, 2, 3), (4, 5, 6), (7, 8, 9), etc., how would one go about calculating the best fit plane f(x, y) = ax + by + c? What's the algorithm for getting a, b, and c out of a set of 3D points?

Was it helpful?

Solution

If you have n data points (x[i], y[i], z[i]), compute the 3x3 symmetric matrix A whose entries are:

sum_i x[i]*x[i],    sum_i x[i]*y[i],    sum_i x[i]
sum_i x[i]*y[i],    sum_i y[i]*y[i],    sum_i y[i]
sum_i x[i],         sum_i y[i],         n

Also compute the 3 element vector b:

{sum_i x[i]*z[i],   sum_i y[i]*z[i],    sum_i z[i]}

Then solve Ax = b for the given A and b. The three components of the solution vector are the coefficients to the least-square fit plane {a,b,c}.

Note that this is the "ordinary least squares" fit, which is appropriate only when z is expected to be a linear function of x and y. If you are looking more generally for a "best fit plane" in 3-space, you may want to learn about "geometric" least squares.

Note also that this will fail if your points are in a line, as your example points are.

OTHER TIPS

unless someone tells me how to type equations here, let me just write down the final computations you have to do:

first, given points r_i \n \R, i=1..N, calculate the center of mass of all points:

r_G = \frac{\sum_{i=1}^N r_i}{N}

then, calculate the normal vector n, that together with the base vector r_G defines the plane by calculating the 3x3 matrix A as

A = \sum_{i=1}^N (r_i - r_G)(r_i - r_G)^T

with this matrix, the normal vector n is now given by the eigenvector of A corresponding to the minimal eigenvalue of A.

To find out about the eigenvector/eigenvalue pairs, use any linear algebra library of your choice.

This solution is based on the Rayleight-Ritz Theorem for the Hermitian matrix A.

See 'Least Squares Fitting of Data' by David Eberly for how I came up with this one to minimize the geometric fit (orthogonal distance from points to the plane).

bool Geom_utils::Fit_plane_direct(const arma::mat& pts_in, Plane& plane_out)
{
    bool success(false);
    int K(pts_in.n_cols);
    if(pts_in.n_rows == 3 && K > 2)  // check for bad sizing and indeterminate case
    {
        plane_out._p_3 = (1.0/static_cast<double>(K))*arma::sum(pts_in,1);
        arma::mat A(pts_in);
        A.each_col() -= plane_out._p_3; //[x1-p, x2-p, ..., xk-p]
        arma::mat33 M(A*A.t());
        arma::vec3 D;
        arma::mat33 V;
        if(arma::eig_sym(D,V,M))
        {
            // diagonalization succeeded
            plane_out._n_3 = V.col(0); // in ascending order by default
            if(plane_out._n_3(2) < 0)
            {
                plane_out._n_3 = -plane_out._n_3; // upward pointing
            }
            success = true;
        }
    }
    return success;
}

Timed at 37 micro seconds fitting a plane to 1000 points (Windows 7, i7, 32bit program)

enter image description here

As with any least-squares approach, you proceed like this:

Before you start coding

  1. Write down an equation for a plane in some parameterization, say 0 = ax + by + z + d in thee parameters (a, b, d).

  2. Find an expression D(\vec{v};a, b, d) for the distance from an arbitrary point \vec{v}.

  3. Write down the sum S = \sigma_i=0,n D^2(\vec{x}_i), and simplify until it is expressed in terms of simple sums of the components of v like \sigma v_x, \sigma v_y^2, \sigma v_x*v_z ...

  4. Write down the per parameter minimization expressions dS/dx_0 = 0, dS/dy_0 = 0 ... which gives you a set of three equations in three parameters and the sums from the previous step.

  5. Solve this set of equations for the parameters.

(or for simple cases, just look up the form). Using a symbolic algebra package (like Mathematica) could make you life much easier.

The coding

  • Write code to form the needed sums and find the parameters from the last set above.

Alternatives

Note that if you actually had only three points, you'd be better just finding the plane that goes through them.

Also, if the analytic solution in unfeasible (not the case for a plane, but possible in general) you can do steps 1 and 2, and use a Monte Carlo minimizer on the sum in step 3.

CGAL::linear_least_squares_fitting_3

Function linear_least_squares_fitting_3 computes the best fitting 3D line or plane (in the least squares sense) of a set of 3D objects such as points, segments, triangles, spheres, balls, cuboids or tetrahedra.

http://www.cgal.org/Manual/latest/doc_html/cgal_manual/Principal_component_analysis_ref/Function_linear_least_squares_fitting_3.html

The equation for a plane is: ax + by + c = z. So set up matrices like this with all your data:

    x_0   y_0   1  
A = x_1   y_1   1  
          ... 
    x_n   y_n   1  

And

    a  
x = b  
    c

And

    z_0   
B = z_1   
    ...   
    z_n

In other words: Ax = B. Now solve for x which are your coefficients. But since (I assume) you have more than 3 points, the system is over-determined so you need to use the left pseudo inverse. So the answer is:

a 
b = (A^T A)^-1 A^T B
c

And here is some simple Python code with an example:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import numpy as np

N_POINTS = 10
TARGET_X_SLOPE = 2
TARGET_y_SLOPE = 3
TARGET_OFFSET  = 5
EXTENTS = 5
NOISE = 5

# create random data
xs = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
ys = [np.random.uniform(2*EXTENTS)-EXTENTS for i in range(N_POINTS)]
zs = []
for i in range(N_POINTS):
    zs.append(xs[i]*TARGET_X_SLOPE + \
              ys[i]*TARGET_y_SLOPE + \
              TARGET_OFFSET + np.random.normal(scale=NOISE))

# plot raw data
plt.figure()
ax = plt.subplot(111, projection='3d')
ax.scatter(xs, ys, zs, color='b')

# do fit
tmp_A = []
tmp_b = []
for i in range(len(xs)):
    tmp_A.append([xs[i], ys[i], 1])
    tmp_b.append(zs[i])
b = np.matrix(tmp_b).T
A = np.matrix(tmp_A)
fit = (A.T * A).I * A.T * b
errors = b - A * fit
residual = np.linalg.norm(errors)

print "solution:"
print "%f x + %f y + %f = z" % (fit[0], fit[1], fit[2])
print "errors:"
print errors
print "residual:"
print residual

# plot plane
xlim = ax.get_xlim()
ylim = ax.get_ylim()
X,Y = np.meshgrid(np.arange(xlim[0], xlim[1]),
                  np.arange(ylim[0], ylim[1]))
Z = np.zeros(X.shape)
for r in range(X.shape[0]):
    for c in range(X.shape[1]):
        Z[r,c] = fit[0] * X[r,c] + fit[1] * Y[r,c] + fit[2]
ax.plot_wireframe(X,Y,Z, color='k')

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.show()

fitted plane

This reduces to the Total Least Squares problem, that can be solved using SVD decomposition.

C++ code using OpenCV:

float fitPlaneToSetOfPoints(const std::vector<cv::Point3f> &pts, cv::Point3f &p0, cv::Vec3f &nml) {
    const int SCALAR_TYPE = CV_32F;
    typedef float ScalarType;

    // Calculate centroid
    p0 = cv::Point3f(0,0,0);
    for (int i = 0; i < pts.size(); ++i)
        p0 = p0 + conv<cv::Vec3f>(pts[i]);
    p0 *= 1.0/pts.size();

    // Compose data matrix subtracting the centroid from each point
    cv::Mat Q(pts.size(), 3, SCALAR_TYPE);
    for (int i = 0; i < pts.size(); ++i) {
        Q.at<ScalarType>(i,0) = pts[i].x - p0.x;
        Q.at<ScalarType>(i,1) = pts[i].y - p0.y;
        Q.at<ScalarType>(i,2) = pts[i].z - p0.z;
    }

    // Compute SVD decomposition and the Total Least Squares solution, which is the eigenvector corresponding to the least eigenvalue
    cv::SVD svd(Q, cv::SVD::MODIFY_A|cv::SVD::FULL_UV);
    nml = svd.vt.row(2);

    // Calculate the actual RMS error
    float err = 0;
    for (int i = 0; i < pts.size(); ++i)
        err += powf(nml.dot(pts[i] - p0), 2);
    err = sqrtf(err / pts.size());

    return err;
}

All you'll have to do is to solve the system of equations.

If those are your points: (1, 2, 3), (4, 5, 6), (7, 8, 9)

That gives you the equations:

3=a*1 + b*2 + c
6=a*4 + b*5 + c
9=a*7 + b*8 + c

So your question actually should be: How do I solve a system of equations?

Therefore I recommend reading this SO question.

If I've misunderstood your question let us know.

EDIT:

Ignore my answer as you probably meant something else.

It sounds like all you want to do is linear regression with 2 regressors. The wikipedia page on the subject should tell you all you need to know and then some.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top