Question

How do I return a list of the 3 lowest values in another list. For example I want to get the 3 lowest values of this list:

in_list = [1, 2, 3, 4, 5, 6]
input: function(in_list, 3)
output: [1, 2, 3]
Était-ce utile?

La solution

You can use heapq.nsmallest:

>>> from heapq import nsmallest
>>> in_list = [1, 2, 3, 4, 5, 6]
>>> nsmallest(3, in_list)
[1, 2, 3]
>>>

Autres conseils

If you could sort,you can get frst 3 elements as below:

 alist=[6, 4, 3, 2, 5, 1]
 sorted(alist)[:3]

Output:

 [1,2,3]

even simpler without modules imported:

l =[3,8,9,10,2,4,1]
l1 = sorted(l)[:3]

hope this helps

Sorting is the reasonable approach. If you care about asymptotic complexity though, you'd want to do this in time O(n) and space O(1).

def k_min(values, k):
    return sorted(values)[:k]

Calling to sorted() can give you time only O(n*log n) and space O(n), so to achieve O(n) time and O(1) space, different approach is needed.

For that, you'd iterate through the list (that's where O(n) comes from) and keep track of the k minimal elements seen so far, which can be done in constant time (since k is here a constant).

For keeping track of the minimal elements, you can use a heap (module heapq), or a list. In case of list, time is O(nk) and in case of heap time is O(nlog k). In any case, because k is for you a constant, this whole thing will end up linear in n in total.

Using a list (which is bit simpler to use than the heap, though of course bad if k is large), it may look like this

def k_min(values, k):
    minima = []  # len(minima) <= k + 1
    for v in values:
        minima.append(v)
        if len(minima) > k:
            minima.remove(max(minima))  # O(k) == O(1)
    return minima

If your lists are long, the most efficient way of doing this is via numpy.partition:

>>> def lowest(a, n): return numpy.partition(a, n-1)[:n]
>>> in_list = [6, 4, 3, 2, 5, 1]
>>> lowest(in_list, 3)
array([1, 2, 3])

This executes in O(N) time, unlike a full sort which would operate in O(NlogN) time. The time savings come from not performing a full sort, but only the minimal amount needed to ensure that the n lowest elements are first. Hence the output is not necessarily sorted.

If you need them to be sorted, you can do that afterwards (numpy.sort(lowest(in_list,3)) => array([1,2,3])). For a large array this will still be faster than sorting the whole thing first.

Edit: Here is a comparison of the speed of numpy.partition, heapq.nsmallest and sorted:

>>> a = numpy.random.permutation(np.arange(1000000))
>>> timeit numpy.partition(a, 2)[:3]
100 loops, best of 3: 3.32 ms per loop
>>> timeit heapq.nsmallest(3,a)
1 loops, best of 3: 220 ms per loop
>>> timeit sorted(a)[:3]
1 loops, best of 3: 1.18 s per loop

So numpy.partition is 66 times faster than heapq.nsmallest for an array with a million elements, and it is 355 times faster than sorted. This doesn't mean that you should never use heapq.nsmallest (which is very flexible), but it demonstrates how important it is to avoid plain lists when speed is important.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top