Question

I want to filter a list so that, among all items sharing the same last 4 digits, only the longest one is printed.

For example:

lst = ['abcd1234','abcdabcd1234','gqweri7890','poiupoiupoiupoiu7890']
# want to return abcdabcd1234 and poiupoiupoiupoiu7890

In this case, we print the longer of the elements ending in 1234, and the longer of the elements ending in 7890. Finding the longest element with a given ending is not hard, but doing it efficiently for every distinct ending in the list seems difficult.

My attempt was to first collect all the distinct last-4-digit endings by slicing each item:

ids=[]
for x in lst:
    ids.append(x[-4:])
ids = list(set(ids))

Next, I would search through the list once per id, tracking a "max_length" variable and a "current_id" to find the longest element for each id. This is clearly very inefficient, and I was wondering what the best way to do this would be.


Solution

Use a dictionary keyed by the last four characters, so a single pass over the list is enough:

>>> lst = ['abcd1234','abcdabcd1234','gqweri7890','poiupoiupoiupoiu7890']
>>> d = {} # to keep the longest items for digits.
>>> for item in lst:
...     key = item[-4:] # last 4 characters
...     d[key] = max(d.get(key, ''), item, key=len)
...
>>> d.values() # list(d.values()) in Python 3.x
['abcdabcd1234', 'poiupoiupoiupoiu7890']
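
The same idea can be wrapped in a reusable function. This is only a minimal sketch for Python 3; the name `longest_by_suffix` and the parameter `n` are illustrative and do not come from the answer above:

```python
def longest_by_suffix(items, n=4):
    """Return a dict mapping each n-character ending to its longest item."""
    longest = {}
    for item in items:
        key = item[-n:]  # last n characters
        # Replace the stored item only if this one is strictly longer,
        # so the first of several equally long items is kept.
        if len(item) > len(longest.get(key, '')):
            longest[key] = item
    return longest


lst = ['abcd1234', 'abcdabcd1234', 'gqweri7890', 'poiupoiupoiupoiu7890']
result = longest_by_suffix(lst)
print(list(result.values()))
```

Each item is visited once, so the work grows linearly with the length of the list.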

Other tips

from collections import defaultdict
d = defaultdict(str)
lst = ['abcd1234','abcdabcd1234','gqweri7890','poiupoiupoiupoiu7890']
for x in lst:
    if len(x) > len(d[x[-4:]]):
        d[x[-4:]] = x

To display the results:

for key, value in d.items():
    print('%s = %s' % (key, value))  # works in Python 2 and 3

which produces:

1234 = abcdabcd1234
7890 = poiupoiupoiupoiu7890

itertools is great. Use groupby with a lambda to group the items by their endings; from there it is easy. Note that groupby only groups adjacent items, so sort the list by the same key first:

>>> from itertools import groupby
>>> lst = ['abcd1234','abcdabcd1234','gqweri7890','poiupoiupoiupoiu7890']
>>> key = lambda l: l[-4:]
>>> [max(y, key=len) for x, y in groupby(sorted(lst, key=key), key)]
['abcdabcd1234', 'poiupoiupoiupoiu7890']
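
To see why input order matters to groupby, here is a small sketch using a reordered copy of the same list. The names `suffix` and `shuffled` are made up for this illustration:

```python
from itertools import groupby


def suffix(s):
    """Grouping key: the last four characters."""
    return s[-4:]


# Same items as above, reordered so equal endings are no longer adjacent.
shuffled = ['abcd1234', 'gqweri7890', 'abcdabcd1234', 'poiupoiupoiupoiu7890']

# Without sorting, every run of adjacent equal keys becomes its own group.
unsorted_result = [max(g, key=len) for _, g in groupby(shuffled, suffix)]

# Sorting by the same key first brings equal endings together.
sorted_result = [max(g, key=len)
                 for _, g in groupby(sorted(shuffled, key=suffix), suffix)]

print(unsorted_result)
print(sorted_result)
```

The unsorted call returns all four items, because no two neighbours share an ending. The sorted call returns only the two longest. Sorting costs O(n log n), so the dictionary approaches are cheaper when the input is not already ordered.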

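Slightly more generic: group on all the digits in each string instead of only the last four characters (Python 2, since it relies on the two-argument form of str.translate):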

import string
import collections
lst = ['abcd1234','abcdabcd1234','gqweri7890','poiupoiupoiupoiu7890']
z = [(x.translate(None, x.translate(None, string.digits)), x) for x in lst]
x = collections.defaultdict(list)
for a, b in z:
  x[a].append(b)

for k in x:
  print k, max(x[k], key=len)

which produces:

1234 abcdabcd1234
7890 poiupoiupoiupoiu7890
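
The two-argument `str.translate(None, ...)` call above exists only in Python 2. A rough Python 3 equivalent, assuming the goal is still to key on every ASCII digit in the string, might look like this (the names `groups` and `digits` are illustrative):

```python
import string
from collections import defaultdict

lst = ['abcd1234', 'abcdabcd1234', 'gqweri7890', 'poiupoiupoiupoiu7890']

groups = defaultdict(list)
for item in lst:
    # Key on every ASCII digit in the string, not only the last four characters.
    digits = ''.join(ch for ch in item if ch in string.digits)
    groups[digits].append(item)

for digits, items in groups.items():
    print(digits, max(items, key=len))
```

Unlike the slicing versions, this also groups items whose digits are not at the end, for example 'ab12cd34' would share the key 1234.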
License: CC-BY-SA with attribution
Not affiliated with StackOverflow