Question

One way (the fastest way?) to iterate over a pair of iterables a and b in sorted order is to chain them and sort the chained iterable:

for i in sorted(chain(a, b)):
    print i

For instance, if the elements of each iterable are:

a: 4, 6, 1
b: 8, 3

then this construct would produce elements in the order

1, 3, 4, 6, 8

However, if the iterables iterate over objects, this sorts the objects by their memory address. Assuming each iterable iterates over the same type of object,

  1. What is the fastest way to iterate over a particular attribute of the objects, sorted by this attribute?

  2. What if the attribute to be chosen differs between iterables? If iterables a and b both iterate over objects of type foo, which has attributes foo.x and foo.y of the same type, how could one iterate over elements of a sorted by x and b sorted by y?

For an example of #2, if

a: (x=4,y=3), (x=6,y=2), (x=1,y=7)
b: (x=2,y=8), (x=2,y=3)

then the elements should be produced in the order

1, 3, 4, 6, 8

as before. Note that only the x attributes from a and the y attributes from b enter into the sort and the result.

Was it helpful?

Solution

Tim Pietzcker has already answered for the case where you're using the same attribute for each iterable. If you're using different attributes of the same type, you can do it like this (using complex numbers as a ready-made class that has two attributes of the same type):

In Python 2:

>>> a = [1+4j, 7+0j, 3+6j, 9+2j, 5+8j]
>>> b = [2+5j, 8+1j, 4+7j, 0+3j, 6+9j]
>>> keyed_a = ((n.real, n) for n in a)
>>> keyed_b = ((n.imag, n) for n in b)
>>> from itertools import chain
>>> sorted_ab = zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0]))[1]
>>> sorted_ab
((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))

Since in Python 3 zip() returns an iterator, we need to coerce it to a list before attempting to subscript it:

>>> # ... as before up to 'from itertools import chain'
>>> sorted_ab = list(zip(*sorted(chain(keyed_a, keyed_b), key=lambda t: t[0])))[1]
>>> sorted_ab
((1+4j), (8+1j), (3+6j), 3j, (5+8j), (2+5j), (7+0j), (4+7j), (9+2j), (6+9j))

OTHER TIPS

Answer to question 1: You can provide a key attribute to sorted(). For example if you want to sort by the object's .name, then use

sorted(chain(a, b), key=lambda x: x.name)

As for question 2: I guess you'd need another attribute for each object (like foo.z, as suggested by Zero Piraeus) that can be accessed by sorted(), since that function has no way of telling where the object it's currently sorting used to come from. After all, it is receiving a new iterator from chain() which doesn't contain any information about whether the current element is from a or b.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top