Get every combination of strings

https://stackoverflow.com/questions/1457814

12-09-2019
|

Question

I had a combinatorics assignment that involved getting every word with length less than or equal to 6 from a specific combination of strings.

In this case, it was S = { 'a', 'ab', 'ba' }. The professor just started listing them off, but I thought it would be easier solved with a program. The only problem is that I can't get a good algorithm that would actually compute every possible option.

If anyone could help, I'd appreciate it. I usually program in Python but really I just need help with the algorithm.

Solution

You can iteratively generate all the strings made from one part, two parts, three parts and so on, until all the strings generated in a step are longer than six characters. Further steps would only generate even longer strings, so all possible short strings have already been generated. If you collect these short strings in each step you end up with a set of all possible generated short strings.

In Python:

S = set(['a', 'ab', 'ba'])

collect = set()
step = set([''])
while step:
   step = set(a+b for a in step for b in S if len(a+b) <= 6)
   collect |= step

print sorted(collect)

OTHER TIPS

Assuming you DO mean combinations (no repetitions, order does not matter):

import itertools

S = [ 'a', 'ab', 'ba' ]

for i in range(len(S)+1):
  for c in itertools.combinations(S, i):
    cc = ''.join(c)
    if len(cc) <= 6:
      print c

emits all the possibilities:

()
('a',)
('ab',)
('ba',)
('a', 'ab')
('a', 'ba')
('ab', 'ba')
('a', 'ab', 'ba')

If you mean something different than "combinations", it's just an issue of using the right iterator or generator in the for (e.g., itertools.permutations, or something else of your own devising).

Edit: if for example you mean "repetitions and order ARE important",

def reps(seq, n):
  return itertools.product(*[seq]*n)

for i in range(7):
  for c in reps(S, i):
    cc = ''.join(c)
    if len(cc) <= 6:
      print c

will give you the required 85 lines of output.

Edit again: I had the wrong loop limit (and therefore wrong output length) -- tx to the commenter who pointed that out. Also, this approach can produce a string > 1 times, if the ''.join's of different tuples are considered equivalent; e.g., it produces ('a', 'ba') as distinct from ('ab', 'a') although their ''.join is the same (same "word" from different so-called "combinations", I guess -- terminology in use not being entirely clear).

def combos(S,n):
    if (n <= 0): return
    for s in S:
        if len(s) <= n: yield s
        for t in combos(S,n-len(s)): yield s+t

for x in combos(["a","ab","ba"],6): print x

Prints output:

a
aa
aaa
aaaa
aaaaa
aaaaaa
aaaaab
aaaaba
aaaab
aaaaba
aaaba
aaabaa
aaab
aaaba
aaabaa
aaabab
aaabba
aaba
aabaa
aabaaa
aabaab
aababa
aab
aaba
aabaa
aabaaa
aabaab
aababa
aabab
aababa
aabba
aabbaa
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
ab
aba
abaa
abaaa
abaaaa
abaaab
abaaba
abaab
abaaba
ababa
ababaa
abab
ababa
ababaa
ababab
ababba
abba
abbaa
abbaaa
abbaab
abbaba
ba
baa
baaa
baaaa
baaaaa
baaaab
baaaba
baaab
baaaba
baaba
baabaa
baab
baaba
baabaa
baabab
baabba
baba
babaa
babaaa
babaab
bababa

Doing it recursively is one way:

cache = {}
def words_of_length(s, n=6):
    # memoise results
    if n in cache: return cache[n]

    # base cases
    if n < 0: return []
    if n == 0: return [""]

    # inductive case
    res = set()
    for x in s:
        res |= set( (x+y for y in words_of_length(s, n-len(x))) )

    # sort and memoise result
    cache[n] = sorted(res)

    # sort results
    return cache[n]

def words_no_longer_than(s, n=6):
    return sum( [words_of_length(s, i) for i in range(n+1)], [] )

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow