Iterator over all partitions into k groups?

Question 1

This works, although it is probably super inneficient (I sort them all to avoid double-counting):

def clusters(l, K):
    if l:
        prev = None
        for t in clusters(l[1:], K):
            tup = sorted(t)
            if tup != prev:
                prev = tup
                for i in xrange(K):
                    yield tup[:i] + [[l[0]] + tup[i],] + tup[i+1:]
    else:
        yield [[] for _ in xrange(K)]

It also returns empty clusters, so you would probably want to wrap this in order to get only the non-empty ones:

def neclusters(l, K):
    for c in clusters(l, K):
        if all(x for x in c): yield c

Counting just to check:

def kamongn(n, k):
    res = 1
    for x in xrange(n-k, n):
        res *= x + 1
    for x in xrange(k):
        res /= x + 1
    return res

def Stirling(n, k):
    res = 0
    for j in xrange(k + 1):
        res += (-1)**(k-j) * kamongn(k, j) * j ** n
    for x in xrange(k):
        res /= x + 1
    return res

>>> sum(1 for _ in neclusters([2,3,5,7,11,13], K=3)) == Stirling(len([2,3,5,7,11,13]), k=3)
True

It works !

The output:

>>> clust = neclusters([2,3,5,7,11,13], K=3)
>>> [clust.next() for _ in xrange(5)]
[[[2, 3, 5, 7], [11], [13]], [[3, 5, 7], [2, 11], [13]], [[3, 5, 7], [11], [2, 13]], [[2, 3, 11], [5, 7], [13]], [[3, 11], [2, 5, 7], [13]]]

Question 2

A simple alternative view of this problem is the assignment of one of the three cluster labels to each element.

import itertools
def neclusters(l, k):
    for labels in itertools.product(range(k), repeat=len(l)):
        partition = [[] for i in range(k)]
        for i, label in enumerate(labels):
            partition[label].append(l[i])
        yield partition

as with @val's answer, this can be wrapped to remove partitions with empty clusters.

Question 3

Edited: As noted by @moose, the following only determines partitions where contiguous indices are in the same cluster. Performing this partitioning over all permutations will give the sought answer.

Itertools is very useful for this sort of combinatoric listing. First, we consider your task as the equivalent problem of selecting all sets of k-1 distinct split points in the array. This is solved by itertools.combinations, which returns combinations without replacement of a particular size from a given iterable, and the values it returns are in the order that they are found in the original iterable.

Your problem is thus solved by the following:

import itertools
def neclusters(l, K):
    for splits in itertools.combinations(range(len(l) - 1), K - 1):
        # splits need to be offset by 1, and padded
        splits = [0] + [s + 1 for s in splits] + [None]
        yield [l[s:e] for s, e in zip(splits, splits[1:])]

numpy's split function is designed to make these sorts of partitions given split offsets, so here's an alternative that generates lists of numpy arrays:

import itertools
def neclusters(l, K):
    for splits in itertools.combinations(range(len(l) - 1), K - 1):
        yield np.split(l, 1 + np.array(splits))

Question 4

Filter partitions of size k using more_itertools.partitions (note the trailing "s"):

Given

import itertools as it

import more_itertools as mit


iterable = [2, 3, 5, 7, 11]
k = 3

Demo

res = [p for perm in it.permutations(iterable) for p in mit.partitions(perm) if len(p) == k]
len(res)
# 720

res
# [[[2], [3], [5, 7, 11]],
#  [[2], [3, 5], [7, 11]],
#  [[2], [3, 5, 7], [11]],
#  [[2, 3], [5], [7, 11]],
#  [[2, 3], [5, 7], [11]],
#  [[2, 3, 5], [7], [11]],
#  ...
#  [[3], [2], [5, 7, 11]],
#  [[3], [2, 5], [7, 11]],
#  [[3], [2, 5, 7], [11]],
#  [[3, 2], [5], [7, 11]],
#  [[3, 2], [5, 7], [11]],
#  [[3, 2, 5], [7], [11]],
#  [[3], [2], [5, 11, 7]],
#  ...
# ]

This version gives partitions of a permuted input. Partitions of repeated elements may be included, e.g. [[3,], [5,], [7, 11, 13]] and [[7, 11, 13]], [3,], [5,]].

Note: more_itertools is a third-party package. Install via > pip install more_itertools.

Question 5

A reasonably efficient way is to pivot on the first element in each recursion to force uniqueness, and simply go through combinations of all increasing sizes up to the point it would give empty subsets.

def kpartitions(l, k):
  import itertools
  if k == 1: yield [l]; return
  for i in range(1, len(l)-k+1+1):
    s = set(range(1, len(l)))
    for comb in itertools.combinations(s, i-1):
      for t in kpartitions([l[idx] for idx in s - set(comb)], k-1):
        yield [[l[0], *(l[idx] for idx in comb)], *t]
def stirlingsecond(n, k):
  import math
  return sum((-1 if (i & 1 != 0) else 1) * math.comb(k, i)*((k-i)**n)
    for i in range(k+1)) // math.factorial(k)
assert len(list(kpartitions([3,5,7,11,13], 3))) == stirlingsecond(5, 3)
assert len(list(kpartitions([2,3,5,7,11,13], 3))) == stirlingsecond(6, 3)

This is quite efficient though it does a little extra work to find the elements not in the combinations, as itertools.combinations is convenient, though writing a combination function which yields both the combination and those elements not in it might give a constant time improvement.