Python floating point error that has left me puzzled [duplicate]

https://stackoverflow.com/questions/22396161

14-06-2023
Question

I just recently ran into a problem where I needed to append numbers to a list only if they weren't in the list already, and then I had to run those numbers through a comparison later on. The problem arises in floating point arithmetic errors. To illustrate what is basically happening in my code:

_list = [5.333333333333333, 6.666666666666667, ...]
number = some_calculation()
if number not in _list:
    _list.append(number) #note that I can't use a set to remove
                         #duplicates because order needs to be maintained
new_list = []
for num in _list:
    if some_comparison(num): #note that I can't combine 'some_comparison' with the
        new_list.append(num) #above check to see if the item is already in the list

The problem is that some_calculation() would sometimes generate an inexact number, such as 5.333333333333332, which is, as far as my calculations need to go, the same as the first element in _list in this example. The solution I had in mind was to simply round all the numbers generated to 9 or so decimal places. This worked for a short amount of time, until I realized that some_comparison compares num against, again, an inexact calculation. Even if I didn't round the numbers in _list, some_comparison would still return an inexact value and thus would evaluate to False.

I am absolutely puzzled. I've never had to worry about floating point errors so this problem is quite infuriating. Does anyone have any ideas for solutions?

NOTE: I would post the actual code, but it's very convoluted and requires 7 or 8 different functions and classes I made specifically for this purpose, and reposting them here would be a hassle.

Solution

Make the comparison something like

if abs(a - b) <= 1e-6 * (abs(a) + abs(b)):

This is standard practice when comparing floating point values. The actual tolerance you use (instead of 1e-6) depends on the magnitude of the numbers you use and your definition of "the same".

EDIT: I scaled the tolerance by (abs(a) + abs(b)) to give some robustness for values of different magnitudes, and changed the comparison to <= rather than < to cover the case where a == b == 0.0. Taking absolute values keeps the right-hand side non-negative, so the test also works when a or b is negative.
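For reference, Python 3.5+ ships a tolerance comparison in the standard library, math.isclose, which could be used for the "append only if not already present" step from the question. The following is a minimal sketch under assumptions: the helper name append_if_new and the tolerance values are hypothetical and should be tuned to your data.

```python
import math

def append_if_new(values, number, rel_tol=1e-9, abs_tol=1e-12):
    """Append number unless a value close to it is already in values.

    Hypothetical helper; rel_tol and abs_tol are assumed defaults, not
    values taken from the question.
    """
    already_there = any(
        math.isclose(number, v, rel_tol=rel_tol, abs_tol=abs_tol)
        for v in values
    )
    if not already_there:
        values.append(number)  # insertion order is preserved
    return values

values = [5.333333333333333, 6.666666666666667]
append_if_new(values, 5.333333333333332)  # close to the first element, skipped
append_if_new(values, 7.5)                # genuinely new, appended at the end
print(values)  # [5.333333333333333, 6.666666666666667, 7.5]
```

The abs_tol argument matters when values can be at or near zero, where a purely relative tolerance never matches anything except an exact zero.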

Other suggestions

You can subclass list and add in a tolerance to __contains__:

class ListOFloats(list):
    def __contains__(self, f):
        # To use a different tolerance, set it on the instance:
        #   l = ListOFloats([seq])
        #   l.tol = tolerance_you_want
        tol = getattr(self, 'tol', 1e-12)
        # Relative comparison; abs() keeps the bound non-negative for negative values
        return any(abs(e - f) <= 0.5 * tol * (abs(e) + abs(f)) for e in self)

_list = ListOFloats([5.333333333333333, 6.666666666666667]) 

print(5.333333333333333 in _list)
# True
print(6.66666666666666 in _list)
# True
print(6.66666666666 in _list)
# False

Use round on both the values in the list and the comparison values. They won't be exact but they'll be consistent, so a search will return the expected results.
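A minimal sketch of that rounding approach, assuming 9 decimal places is precise enough for your data (the DIGITS constant and the normalize helper are hypothetical names used only for illustration):

```python
DIGITS = 9  # assumed precision; pick what "the same" means for your data

def normalize(x, digits=DIGITS):
    """Round a value so stored and compared numbers share one precision."""
    return round(x, digits)

# Round the values when they are stored...
stored = [normalize(x) for x in [5.333333333333333, 6.666666666666667]]

# ...and round the value you compare against in exactly the same way.
candidate = normalize(5.333333333333332)
print(candidate in stored)  # True
```

One caveat: two values that sit on opposite sides of a rounding boundary can still round to different results, so rounding makes mismatches rarer rather than impossible.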

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow