Given a list of integers, determine if 70% of the values are within 20% of one of the values
-
11-06-2021 - |
Pregunta
I want to check if the list values have some level of "closeness". Is there a good algorithm to do this? Bonus points for the most pythonic way.
Valid
[1,7,8,9]
[3,4,100,101,102,103,104,105]
Not Valid
[1,8,9]
[1,10]
[100,200,300,400,500]
Solución
For small lists this O(n^2) algorithm will suffice:
def is_close(l):
for n in l:
c = sum([1 for x in l if x >= 0.8 *n and x <= 1.2 * n])
if c >= 0.7 * len(l):
return True
return False
print is_close([1,7,8,9])
print is_close([3,4,100,101,102,103,104,105])
print is_close([1,8,9])
print is_close([1,10])
print is_close([100,200,300,400,500])
Output is:
True
True
False
False
False
Otros consejos
Look up variance: http://en.wikipedia.org/wiki/Variance
Here is a simple linear-time algorithm for an array a
that is already sorted (as in the examples, otherwise it needs to be sorted beforehand in O(n log n)
time). The idea is to construct and test each maximal subsequence that starts at a given position low
.
low = middle = high = 1
while (low <= length (a))
advance middle to the largest i such that a[i]*0.8<=a[low]
advance high to the largest i such that a[i]<=a[middle]*1.2
if ((high-low+1)/length(a)>=0.7) output(true)
low = low + 1
return (false);
Since low
, middle
, and high
are always increased from 1
through length(a)
, the running time is always linear in length(a)
.
If the matching subsequence of a
is desired, one can output a[low]...a[high]
instead of true
.
Here is an algorithm that takes n logn
time.
sort the array
for i in range(len(array))
begin = binary search an index such that array[begin] >= array[i]*0.2
end = binary search an index such that array[end]*0.2 <= array[i]
if (end - begin) <= len(array) * 0.7
70% of the values are within %20 of array[i]
i.e all elements between begin and end are within 20% of array[i]
Several optimizations, including changing the iteration order, are possible.