I have an expression that is used to estimate percentiles by interpolating between two values.

windowMin + (currentPercentile - lastPercentile) * (windowMax - windowMin) / (percentile - lastPercentile)

This has given me very good real-world results. However, in my unit tests, I'm having trouble asserting that things are working correctly, since I consistently get significant rounding error.

In three test cases, I try to get the 40th, 50th and 60th percentile, resulting in these computations:

1 + (0.4 - 0.3333333333333333) * (2 - 1) / (0.6666666666666666 - 0.3333333333333333)
1 + (0.5 - 0.3333333333333333) * (2 - 1) / (0.6666666666666666 - 0.3333333333333333)
1 + (0.6 - 0.3333333333333333) * (2 - 1) / (0.6666666666666666 - 0.3333333333333333)

This yields:

{
  "0.4": 1.2000000000000002,
  "0.5": 1.5,
  "0.6": 1.8
}

This fails my assertion, which is looking for 1.2 for the 40th percentile.

Is there a way to restructure this expression to improve accuracy in all cases? If not, is there an easy way to work around this issue with chai assertions?


Solution

It turns out that 1.2000000000000002 is already the double-precision floating-point value nearest to the exact result of the interpolation you posted, as the Pharo Smalltalk expression below illustrates (asTrueFraction converts a floating-point value to a Fraction with exactly the same value):

(1 + ((0.4 asTrueFraction - 0.3333333333333333 asTrueFraction) * (2 - 1) / (0.6666666666666666 asTrueFraction - 0.3333333333333333 asTrueFraction))) asFloat
-> 1.2000000000000002.

The same holds if you treat the inputs as the decimal numbers you typed and evaluate the interpolation with exact arithmetic. We can do that by replacing asTrueFraction with asMinimalDecimalFraction (which gives the decimal number with the fewest digits that still rounds to the same Float):

0.4 asTrueFraction -> (3602879701896397/9007199254740992).
0.4 asMinimalDecimalFraction -> (2/5).
0.3333333333333333 asMinimalDecimalFraction -> (3333333333333333/10000000000000000).
0.6666666666666666 asMinimalDecimalFraction -> (3333333333333333/5000000000000000).

You then get the same result again. Here is how it breaks down:

(1 + ((0.4 asMinimalDecimalFraction - 0.3333333333333333 asMinimalDecimalFraction) * (2 - 1) / (0.6666666666666666 asMinimalDecimalFraction - 0.3333333333333333 asMinimalDecimalFraction))) 
 -> (4000000000000000/3333333333333333).

(4000000000000000/3333333333333333) asFloat ->  1.2000000000000002.

In other words, if you want a result in floating point, then 1.2000000000000002 is the best value.
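The same check can be sketched in JavaScript, the language of the question. This is a minimal sketch and not part of the original answer. It assumes the three percentiles are taken as the short decimals they print as (the asMinimalDecimalFraction view), scaled by 10^16 into BigInt values so the subtractions are exact. The names current, last, upper, num and den are made up for illustration.

```javascript
// Decimal inputs scaled by 10^16 so they are exact integers (assumption:
// the inputs are the short decimals as printed, not their binary values).
const current = 4000000000000000n; // 0.4
const last    = 3333333333333333n; // 0.3333333333333333
const upper   = 6666666666666666n; // 0.6666666666666666

// 1 + (current - last) * (2 - 1) / (upper - last), kept as one exact fraction num/den.
const den = upper - last;            // 3333333333333333n
const num = den + (current - last);  // 4000000000000000n

// Both integers are below 2^53, so Number() converts them exactly and the
// single division rounds the true quotient to the nearest double.
const nearest = Number(num) / Number(den);
console.log(num, den, nearest);
console.log(nearest === 1.2); // false
```

Because the final division is the only rounding step, nearest is the double closest to the exact fraction, and it is not the double that the literal 1.2 denotes.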

I'm not saying the interpolation formula will always be exact as written, since it can accumulate round-off errors, but it already does a decent job on your input data.

Change the test rather than the formula, and insert explicit accuracy requirements.
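One way to make the accuracy requirement explicit, sketched here in plain JavaScript with no test library, is to compare against a stated tolerance. The helper name approxEqual, the wrapper interpolate, and the default tolerance of 1e-12 are assumptions chosen for illustration. Pick a tolerance that matches what your application actually needs.

```javascript
// Hypothetical helper: true when actual is within relTol of expected.
// The tolerance scales with the larger magnitude of the two values, with a
// floor of 1 so that values near zero are compared with an absolute tolerance.
function approxEqual(actual, expected, relTol = 1e-12) {
  const scale = Math.max(Math.abs(actual), Math.abs(expected), 1);
  return Math.abs(actual - expected) <= relTol * scale;
}

// The interpolation from the question, unchanged, wrapped in a function.
function interpolate(windowMin, windowMax, lastPercentile, percentile, currentPercentile) {
  return windowMin + (currentPercentile - lastPercentile) * (windowMax - windowMin) / (percentile - lastPercentile);
}

const p40 = interpolate(1, 2, 0.3333333333333333, 0.6666666666666666, 0.4);
console.log(p40 === 1.2);           // false: exact comparison fails
console.log(approxEqual(p40, 1.2)); // true: within the stated tolerance
```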

Other tips

These rounding errors are a characteristic of floating point maths.

One possible solution might be to apply .toPrecision() to your calculations before returning the result:

var result = windowMin + (currentPercentile - lastPercentile) * (windowMax - windowMin) / (percentile - lastPercentile);
return result.toPrecision(6);  // returns a string with six significant figures

or possibly toFixed():

return result.toFixed(2); // returns a string with two decimal places
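Note that both toPrecision() and toFixed() return strings, so a numeric assertion against 1.2 needs the value converted back. A small sketch, using the value from the question as a stand-in for the computed result:

```javascript
// Stand-in for the interpolated value from the question.
const result = 1.2000000000000002;

const asText = result.toPrecision(6); // a string, not a number
const rounded = Number(asText);       // convert back before comparing numerically

console.log(typeof asText);                     // "string"
console.log(rounded === 1.2);                   // true
console.log(Number(result.toFixed(2)) === 1.2); // true
```

Rounding inside the function changes what it returns to every caller, so it may be preferable to round only in the test.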

Chai's closeTo assertion is designed to handle this sort of test:

expect(calculatedValue).to.be.closeTo(1.2, 0.000001);

The first argument is the expected value and the second is a delta indicating how close calculatedValue needs to be to 1.2.

I see two ways to solve this problem:

  1. If the results will only ever have a finite number of digits, you can round them to some small precision.

  2. The division is what trips you up. Multiply both sides of the equation by the divisor and you'll have no more infinite fractions in the result :)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow