Question

I have a set of measured radii (t + epsilon + error) at equally spaced angles. The model is a circle of radius R with its center at (r, Alpha), with small added noise and some random errors that are much larger than the noise.

The problem is to find the center of the circle model (r, Alpha) and the radius of the circle (R), but the fit should not be too sensitive to the random errors (in the data below, points 7 and 14).

Some radii could be missing, so a simple mean would not work here.

I tried least-squares optimization, but it reacts strongly to the errors.

Is there a way to optimize the least deltas rather than the least squares of the deltas in Python?

Model:
n=36
R=100
r=10
Alpha=2*Pi/6

Data points:
[95.85, 92.66, 94.14, 90.56, 88.08, 87.63, 88.12, 152.92, 90.75, 90.73, 93.93, 92.66, 92.67, 97.24, 65.40, 97.67, 103.66, 104.43, 105.25, 106.17, 105.01, 108.52, 109.33, 108.17, 107.10, 106.93, 111.25, 109.99, 107.23, 107.18, 108.30, 101.81, 99.47, 97.97, 96.05, 95.29]
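
For illustration, a minimal sketch of how data like this could be generated from the model (the radial formula assumes the measurement origin lies inside the circle; the noise and error sizes below are my assumptions, not values from the actual measurement):

import numpy as np

# Radius seen from the origin along each angle, for a circle of radius R
# centred at polar coordinates (r, Alpha); valid while the origin is inside it.
n, R, r, Alpha = 36, 100.0, 10.0, 2 * np.pi / 6
theta = 2 * np.pi * np.arange(n) / n
true_radii = r * np.cos(theta - Alpha) + np.sqrt(R**2 - (r * np.sin(theta - Alpha))**2)
measured = true_radii + np.random.normal(0.0, 1.0, n)  # small noise (epsilon), assumed scale
measured[[7, 14]] += np.array([60.0, -30.0])           # large random errors (assumed sizes)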

Solution 2

Replying to your final question

Is there a way to optimize the least deltas rather than the least squares of the deltas in Python?

Yes, pick an optimization method (for example the downhill simplex implemented in scipy.optimize.fmin) and use the sum of absolute deviations as the merit function. Your dataset is small, so I suppose any general-purpose optimization method will converge quickly. (For non-linear least-squares fitting it is also possible to use a general-purpose optimization algorithm, but it is more common to use the Levenberg-Marquardt algorithm, which minimizes sums of squares.)
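
A minimal sketch of that approach, assuming the radius seen from the origin along angle theta is r*cos(theta - alpha) + sqrt(R^2 - (r*sin(theta - alpha))^2) (origin inside the circle); the helper names and the starting guess are mine, not from the question:

import numpy as np
from scipy.optimize import fmin

# The 36 measured radii from the question, taken at equally spaced angles.
data = np.array([
    95.85, 92.66, 94.14, 90.56, 88.08, 87.63, 88.12, 152.92, 90.75,
    90.73, 93.93, 92.66, 92.67, 97.24, 65.40, 97.67, 103.66, 104.43,
    105.25, 106.17, 105.01, 108.52, 109.33, 108.17, 107.10, 106.93,
    111.25, 109.99, 107.23, 107.18, 108.30, 101.81, 99.47, 97.97,
    96.05, 95.29])
theta = 2 * np.pi * np.arange(len(data)) / len(data)

def model_radius(params, theta):
    # Radius seen from the origin along angle theta for a circle of radius R
    # centred at polar coordinates (r, alpha); assumes the origin is inside it.
    R, r, alpha = params
    return r * np.cos(theta - alpha) + np.sqrt(
        np.maximum(R**2 - (r * np.sin(theta - alpha))**2, 0.0))

def abs_dev(params):
    # Sum of absolute deviations as the merit function.
    return np.sum(np.abs(data - model_radius(params, theta)))

# Rough starting guess: R near the median radius, small offset, zero angle.
R_fit, r_fit, alpha_fit = fmin(abs_dev, x0=[np.median(data), 1.0, 0.0])
print(R_fit, r_fit, alpha_fit)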

If you are interested in when minimizing absolute deviations instead of squares has a theoretical justification, see Numerical Recipes, the chapter on Robust Estimation.

From the practical side, the sum of absolute deviations may not have a unique minimum. In the trivial case of two points, say (0, 5) and (1, 9), and a constant function y = a, any value of a between 5 and 9 gives the same sum: the two deviations are a - 5 and 9 - a, which always add up to 4. There is no such problem when deviations are squared.

If minimizing absolute deviations does not work well, you may consider a heuristic procedure to identify and remove outliers, such as RANSAC or ROUT.

OTHER TIPS

It seems like your main problem here is going to be removing outliers. There are a couple of ways to do this, but for your application your best bet is probably just to remove points based on their distance from the median (since the median is much less sensitive to outliers than the mean).

If you're using numpy, that would look something like this:

import numpy as np

def remove_outliers(data_points, margin=1.5):
    # Scaled distance of each point from the median; keep points below the margin.
    nd = np.abs(data_points - np.median(data_points))
    s = nd / np.median(nd)
    return data_points[s < margin]

After which you should run least squares.
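
A sketch of that step with scipy.optimize.least_squares, using the same assumed radial model as above; the outlier criterion is applied as a boolean mask so the surviving radii stay paired with their angles:

import numpy as np
from scipy.optimize import least_squares

# The 36 measured radii from the question, taken at equally spaced angles.
data = np.array([
    95.85, 92.66, 94.14, 90.56, 88.08, 87.63, 88.12, 152.92, 90.75,
    90.73, 93.93, 92.66, 92.67, 97.24, 65.40, 97.67, 103.66, 104.43,
    105.25, 106.17, 105.01, 108.52, 109.33, 108.17, 107.10, 106.93,
    111.25, 109.99, 107.23, 107.18, 108.30, 101.81, 99.47, 97.97,
    96.05, 95.29])
theta = 2 * np.pi * np.arange(len(data)) / len(data)

# Same median-based criterion as remove_outliers above, kept as a mask so the
# angles can be filtered together with the radii.
dev = np.abs(data - np.median(data))
keep = dev / np.median(dev) < 1.5
d, t = data[keep], theta[keep]

def residuals(params):
    # Measured minus model radius for a circle of radius R centred at (r, alpha).
    R, r, alpha = params
    model = r * np.cos(t - alpha) + np.sqrt(
        np.maximum(R**2 - (r * np.sin(t - alpha))**2, 0.0))
    return d - model

fit = least_squares(residuals, x0=[np.median(d), 1.0, 0.0])
R_fit, r_fit, alpha_fit = fit.x
print(R_fit, r_fit, alpha_fit)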

If you're not using numpy, you can do something similar with native Python lists:

def median(points):
    # Middle element of the sorted list (for an even length this takes the
    # upper of the two middle values).
    return sorted(points)[len(points) // 2]

def remove_outliers(data_points, margin=1.5):
    m = median(data_points)
    deviations = [abs(point - m) for point in data_points]
    deviation_median = median(deviations)
    ratios = [d / deviation_median for d in deviations]
    # Keep the points whose scaled deviation is within the margin.
    return [point for i, point in enumerate(data_points) if ratios[i] < margin]

If you're looking to just not count outliers as highly, you can take the median of your dataset: the median minimizes the sum of absolute deviations, the linear (L1) counterpart of the least-squares optimization.

If you're looking for something a little better, I might suggest passing your data through some kind of low-pass filter, though I don't think that's really needed here.

A low-pass filter would probably work best; you can implement a simple one as follows. (Note: alpha here is the filter's smoothing factor, not the model angle, and it is a number you will have to fiddle with to get the output you want.)

def low_pass(data, alpha):
    # Exponential smoothing: each output is a weighted blend of the current
    # sample and the previous smoothed value.
    new_data = [data[0]]
    for i in range(1, len(data)):
        new_data.append(alpha * data[i] + (1 - alpha) * new_data[i - 1])
    return new_data

At which point your least squares optimization should work fine.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow