Question

I have a list of names alphabetically, like:

list = ['ABC', 'ACE', 'BED', 'BRT', 'CCD', ..]

How can I get element from each starting letter? Do I have to iterate the list one time? or Does python has some function to do it? New to python, this may be a really naive problem.

Suppose I want to get the second element from names that starts from 'A', this case I get 'ACE'.

Was it helpful?

Solution

If you're going to do multiple searches, you should take the one-time hit of iterating through everything and build a dictionary (or, to make it simpler, collections.defaultdict):

from collections import defaultdict

d = defaultdict(list)

words = ['ABC', 'ACE', 'BED', 'BRT', 'CCD', ...]

for word in words:
    d[word[0]].append(word)

(Note that you shouldn't name your own variable list, as it shadows the built-in.)

Now you can easily query for the second word starting with "A":

d["A"][1] == "ACE"

or the first two words for each letter:

first_two = {c: w[:2] for c, w in d.items()}

OTHER TIPS

Using generator expression and itertools.islice:

>>> import itertools
>>> names = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
>>> next(itertools.islice((name for name in names if name.startswith('A')), 1, 2), 'no-such-name')
'ACE'

>>> names = ['ABC', 'BBD', 'BED', 'BRT', 'CCD']
>>> next(itertools.islice((name for name in names if name.startswith('A')), 1, 2), 'no-such-name')
'no-such-name'

Simply group all the elements by their first char

from itertools import groupby
from operator import itemgetter

example = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']


d = {g:list(values) for g, values in groupby(example, itemgetter(0))}

Now to get a value starting with a:

print d.get('A', [])

This is most usefull when you have a static list and will have multiple queries since as you may see, getting the 3rd item starting with 'A' is done in O(1)

You might want to use list comprehensions

mylist = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']
elements_starting_with_A = [i for i in mylist if i[0] == 'A']
>>> ['ABC', 'ACE']
second = elements_starting_with_A[1]
>>> 'ACE'

In addition to list comprehension as others have mentioned, lists also have a sort() method.

mylist = ['AA', 'BB', 'AB', 'CA', 'AC']
newlist = [i for i in mylist if i[0] == 'A']
newlist.sort()
newlist
>>> ['AA', 'AB', 'AC']

The simple solution is to iterate over the whole list in O(n) :

(name for name in names if name.startswith('A'))

However you could sort the names and search in O(log(n)) for the item which is supposed to be on the index or after (using lexicographic comparison). The module bisect will help you to find the bounds :

from bisect import bisect_left

names = ['ABC', 'ACE', 'BED', 'BRT', 'CCD']

names.sort() 

lower = bisect_left(names, 'B')
upper = bisect_left(names, chr(1+ord('B')))

print [names[i] for i in range(lower, upper)] 
# ['BED', 'BRT']
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top