You mention in a comment that the code here is opaque to you. But it's probably the best way of implementing a combinations
function of the kind you're aiming for, and it's worth understanding, so I'll try to explain it in detail.
The basic idea is that given a sequence and a number of items to choose, we can represent each combination as a sequence of indices into the given sequence. So for example, say we have a list ['a', 'b', 'c', 'd', 'e']
, and we want to generate all combinations of two values from that list.
Our first combination looks like this...
['a', 'b', 'c', 'd', 'e']
^ ^
...and is represented by the list of indices [0, 1]
. Our next combination looks like this:
['a', 'b', 'c', 'd', 'e']
^ ^
And is represented by the list of indices [0, 2]
.
We keep moving the second caret forward, keeping the first in place, until the second caret reaches the end. Then we move the first caret to index 1
and "reset" the process by moving the second caret back to index 2
.
['a', 'b', 'c', 'd', 'e']
^ ^
Then we repeat the process, moving the second caret forward until it reaches the end, and then moving the first forward by one and resetting the second.
Now we have to figure out how to do this by manipulating the list of indices. It turns out that this is quite simple. The final combination will look like this:
['a', 'b', 'c', 'd', 'e']
^ ^
And the index representation of this will be [3, 4]
. These are the maximum possible values for the indices, and are equal to i + n - r
, where i
is the position in the list, n
is the number of values (5
in this case), and r
is the number of choices (2
in this case). So as soon as a particular index reaches this value, it can go no higher, and will need to be "reset".
So with that in mind, here's a step-by-step analysis of the code:
def combinations(iterable, r):
pool = tuple(iterable)
n = len(pool)
First, given input based on the above example, pool
would be is our list of characters above converted into a tuple, and n
is simply the number of items in the pool.
if r > n:
return
We can't select more than n
items from an n
item list without replacement, so we simply return in that case.
indices = range(r)
Now we have our indices, initialized to the first combination ([0, 1]
). So we yield it:
yield tuple(pool[i] for i in indices)
Then we generate the remaining combinations, using an infinite loop.
while True:
Inside the loop, we first step backwards through the list of indices searching for an index that hasn't reached it's maximum value yet. We use the formula described above (i + n - r
) to determine the maximum value for a given index. If we find an index that hasn't reached it's maximum value, then we break out of the loop.
for i in reversed(range(r)):
if indices[i] != i + n - r:
break
If we don't find one, then that means that all the indices are at their maximum value, and so we're done iterating. (This uses the little-known for-else
construct; the else
block is executed only if the for
loop terminates normally.)
else:
return
So now we know that index i
needs to be incremented:
indices[i] += 1
Additionally, the indices after i
are all at their maximum values, and so need to be reset.
for j in range(i+1, r):
indices[j] = indices[j-1] + 1
Now we have the next set of indices, so we yield another combination.
yield tuple(pool[i] for i in indices)
There are several variations on this approach; in another, instead of stepping backwards through the indices, you step forward, incrementing the first index that has a "gap" between it and the following index, and resetting the lower indices.
Finally, you could also define this recursively, although pragmatically, the recursive definition probably won't be as efficient.