Pregunta

I want to check if the list values have some level of "closeness". Is there a good algorithm to do this? Bonus points for the most pythonic way.

Valid

[1,7,8,9]
[3,4,100,101,102,103,104,105]

Not Valid

[1,8,9]
[1,10]
[100,200,300,400,500]
¿Fue útil?

Solución

For small lists this O(n^2) algorithm will suffice:

def is_close(l):
    for n in l:
        c = sum([1 for x in l if x >= 0.8 *n and x <= 1.2 * n])
        if c >= 0.7 * len(l):
            return True
    return False

print is_close([1,7,8,9])
print is_close([3,4,100,101,102,103,104,105])
print is_close([1,8,9])
print is_close([1,10])
print is_close([100,200,300,400,500])

Output is:

True
True
False
False
False

Otros consejos

Here is a simple linear-time algorithm for an array a that is already sorted (as in the examples, otherwise it needs to be sorted beforehand in O(n log n) time). The idea is to construct and test each maximal subsequence that starts at a given position low.

low = middle = high = 1
while (low <= length (a))
   advance middle to the largest i such that a[i]*0.8<=a[low]
   advance high to the largest i such that a[i]<=a[middle]*1.2
   if ((high-low+1)/length(a)>=0.7) output(true)
   low = low + 1
return (false);

Since low, middle, and high are always increased from 1 through length(a), the running time is always linear in length(a).

If the matching subsequence of a is desired, one can output a[low]...a[high] instead of true.

Here is an algorithm that takes n logn time.

sort the array
for i in range(len(array))
    begin = binary search an index such that array[begin] >= array[i]*0.2
    end = binary search an index such that array[end]*0.2 <= array[i]
    if (end - begin) <= len(array) * 0.7
        70% of the values are within %20 of array[i]
        i.e all elements between begin and end are within 20% of array[i]

Several optimizations, including changing the iteration order, are possible.

Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top