Question

I have a long list of longitude values (len(Lon) = 420481), and another one of latitude values. I want to find the corresponding latitude to the minimum of the longitude.

I tried:

SE_Lat = [Lat[x] for x,y in enumerate(Lon) if y == min(Lon)]

but this takes ages to finish.

Does anyone know a more efficient way?

Maybe you also have a suggestions for this: I now try to find the closest corresponding latitude to a new longitude, which is not in the original longitude vector. I tried this:

minDiff = [min(abs(x - lon_new) for x in lons)] # not very quick, but works
[(lat,lon) for lat,lon in izip(lats,lons) if abs(lon-lon_new)==minDiff]

The last line throws an error, because there are multiple matches. I don't know at the moment how to find only one value, lets say the first. Any help is greatly appreciated!

Was it helpful?

Solution

May I recommend numpy?

import numpy
nplats = numpy.array(lats)
nplons = numpy.array(lons)

# this part is 20x faster than using the built-in python functions
index = numpy.argmin(nplats)

print nplats[index], nplons[index]

this is way faster than the min(izip()) solution (~20x using my setup when using 420481 randomly created records), although of course you'd need to store your data values in numpy to take advantage of this speed-up.

OTHER TIPS

min(itertools.izip(Lat, Lon), key=operator.itemgetter(1))[0]

Rather than jumping right in with one of the many alternatives for solving this (which can be seen in the other answers), it's worth enumerating why the code in the original example is so slow.

SE_Lat = [Lat[x] for x,y in enumerate(Lon) if y == min(Lon)]

We know from the OP that len(Lon) == 420481. Now, finding the minimum value is an O(N) operation (you have to look at every value at least once). In a list comprehension, the condition is reevaluated on every iteration. The above code recalculates the minimum value on every pass through the loop, blowing what should be an O(N) operation out to be O(N^2) (A mere 177 billion iterations in this case).

Simply caching the result of min(Lon) in a local variable and using that in the loop condition instead of recalculating it every iteration would likely bring the runtime down to an acceptable level.

However, the way I would personally go about it (assuming I wanted all of the latitude, longitude and index later on):

min_longitude, min_index = min(longitude, index for index, longitude in enumerate(Lon))
min_latitude = Lat[min_index]

There are plenty of possibilities though, and which one is best will vary based on the exact use case.

Just first find the index:

index = min(enumerate(Lon), key=operator.itemgetter(1))[1] 
Lat[index]
pairs = zip(latitudes, longitudes)
minLonPair = min(pairs, key=lambda p:p[1])
print(minLonPair[0])

As per Ignacio's solution, if you are using python2, you will want to use izip rather than zip. This is, however, true for everything you do in python2.

Here was my original answer:

>>> lats = [1,2,3,4]
>>> lons = [5,4,8,9]
>>> from itertools import izip
>>> min(izip(lats,lons), key=lambda x:x[1])
(2, 4)

But I see that the OP seemed to be allowing for there being multiple matches at the minimum lon value, and for this, I don't think there is a one-liner. The trick is, you only want to find min(lons) once, not once for every lat,lon pair:

>>> lats = [1,2,3,4]
>>> lons = [5,4,8,4]
>>> minlon = min(lons)
>>> [(lat,lon) for lat,lon in izip(lats,lons) if lon==minlon]
[(2, 4), (4, 4)]

This one-liner might work for you, since the lambda argument minlon should only be computed once:

>>> filter(lambda latlon,minlon=min(lons):latlon[1]==minlon, izip(lats,lons))
[(2, 4), (4, 4)]

Not sure how well it will work on 420481-element lists though. And for readability and long-term support, I would probably choose the more explicit 2-liner solution.

Last point: Sometimes you only get one pass through a sequence, such as when it is an iterator, or the output of a generator. To support multiple matches and take only one pass through the two lists, this was the best I could do:

from itertools import izip

def get_lats_at_min_lon(lats, lons):
    minlon = 200
    minlats = []
    for lat,lon in izip(lats, lons):
        if lon < minlon:
            minlats = [lat]
            minlon = lon
        elif lon == minlon:
            minlats.append(lat)
    return minlon, minlats

lats = iter([1,2,3,4])
lons = iter([5,4,8,4])

print get_lats_at_min_lon(lats,lons)

Prints:

(4, [2, 4])
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top