Question

I have a script that loops through a series of four (or less) characters strings. For example:

aaaa
aaab
aaac
aaad

If have been able to implement it with nested for loops like so:

chars = string.digits + string.uppercase + string.lowercase

for a in chars:
    print '%s' % a   
    for b in chars:
        print '%s%s' % (a, b)
        for c in chars:
            print '%s%s%s' % (a, b, c)
            for d in chars:
                print '%s%s%s%s' % (a, b, c, d)

Is this sort of loop nesting a bad thing, and if so, what would be a better way of accomplishing what I am doing?

Was it helpful?

Solution

import string
import itertools

chars = string.digits + string.letters
MAX_CHARS = 4
for nletters in range(MAX_CHARS):
    for word in itertools.product(chars, repeat=nletters + 1):
        print (''.join(word))

That'll print all 15018570 words you're looking for. If you want more/less words just change the MAX_CHARS variable. It will still have just two fors for any number of chars, and you don't have to repeat yourself. And is pretty readable. .

OTHER TIPS

I'm going to submit my answer as the most readable and least scalable :)

import string
chars = [''] + list(string.lowercase)

strings = (a+b+c+d for a in chars
                   for b in chars
                   for c in chars
                   for d in chars)

for string in strings:
    print string

EDIT: Actually, this is incorrect, as it will produce duplicates of all strings of length<4. Removing the empty string from the chars array would just produce 4-char strings.

Normally I'd delete this answer, but I still kinda like it if you need to generate strings of the same length.

Write for the programmer first - the computer second.
If it's clear and obvious to understand then its correct.

If speed matters AND the compiler doesn't optimise it anyway AND if you measure it AND it is the problem - then think of a faster cleverer way!

I don't think it's a bad thing, provided you understand (and document :-) it. I don't doubt there may be a more pythonic way or clever solution (with lambdas or whatnot) but I've always favored readability over cleverness.

Since you have to generate all possibilities of 1-, 2-, 3- and 4-character "words", this method is as good as any. I'm not sure how long it would take since you're effectively generating (very roughly) 14 million lines of output (but probably every solution would have that problem).

Pre-calculating the common prefixes may provide a speed boost but you'd be better off measuring it to check (always check, never assume):

chars = string.digits + string.uppercase + string.lowercase
for a in chars:
    print a
    for b in chars:
        ab = '%s%s' % (a, b)
        print ab
        for c in chars:
            abc = '%s%s' % (ab, c)
            print abc
            for d in chars:
                print '%s%s' % (abc, d)

EDIT: I actually did some benchmarks (with Windows-Python 2.6.1) - this version takes about 2.25 time units compared to the original 2.84 so it's 26% faster. I think that might warrant its use (again, as long as it's documented clearly what it's trying to achieve).

@nosklo's and @Triptych's solutions produce different results:

>>> list(map(''.join, itertools.chain.from_iterable(itertools.product("ab", 
...     repeat=r) for r in range(4)))) # @nosklo's 
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 
 'bab', 'bba', 'bbb']
>>> ab = ['']+list("ab")
>>> list(map(''.join, (a+b+c for a in ab for b in ab for c in ab)))  
['', 'a', 'b', 'a', 'aa', 'ab', 'b', 'ba', 'bb', 'a', 'aa', 'ab', 'aa', 
 'aaa', 'aab', 'ab', 'aba', 'abb', 'b', 'ba', 'bb', 'ba', 'baa', 'bab', 
 'bb',  'bba', 'bbb']

Here's modified @Triptych's solution that produce the same output as the @nosklo's one:

>>> ab = "ab"
>>> list(map(''.join, itertools.chain([''], ab, (a+b for a in ab for b in ab),
...     (a+b+c for a in ab for b in ab for c in ab))))
['', 'a', 'b', 'aa', 'ab', 'ba', 'bb', 'aaa', 'aab', 'aba', 'abb', 'baa', 
 'bab', 'bba', 'bbb']

There are many algorithms for generating every permutation of a set. What you want here is a related problem, but not directly analagous. Suggested Reading

It doesn't exactly answer the question, but this would return the nth combination for the given maximum length and characters in the alphabet to use:

#!/usr/bin/python

def nth_combination(n, maxlen=4, alphabet='abc'):
    """
    >>> print ','.join(nth_combination(n, 1, 'abc') for n in range(3))
    a,b,c
    >>> print ','.join(nth_combination(n, 2, 'abc') for n in range(12))
    a,aa,ab,ac,b,ba,bb,bc,c,ca,cb,cc
    >>> import string ; alphabet = string.ascii_letters + string.digits
    >>> print ','.join(nth_combination(n, 4, alphabet) for n in range(16))
    a,aa,aaa,aaaa,aaab,aaac,aaad,aaae,aaaf,aaag,aaah,aaai,aaaj,aaak,aaal,aaam
    >>> print ','.join(nth_combination(n, 4, alphabet)
    ...                for n in range(0, 14000000, 10**6))
    a,emiL,iyro,mKz2,qWIF,u8Ri,zk0U,Dxav,HJi9,LVrM,P7Ap,UjJ1,YvSE,2H1h
    """
    if maxlen == 1:
        return alphabet[n]
    offset, next_n = divmod(n, 1 + len(alphabet)**(maxlen-1))
    if next_n == 0:
        return alphabet[offset]
    return alphabet[offset] + nth_combination(next_n-1, maxlen-1, alphabet)

if __name__ == '__main__':
    from doctest import testmod
    testmod()

This of course makes sense only if you need random access to the set of combinations instead of always iterating through them all.

If maxlen is high, some speed optimization could be achieved e.g. by getting rid of string concatenation and re-calculating the length of alphabet and maxlen-1 at each level of the recursion. A non-recursive approach might make sense, too.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top