Find n-th set of a powerset

Question 1

I don't have a closed form for the function, but I do have a bit-hacking, non-looping next_combination function, which you're welcome to, if it helps. It assumes that you can fit the bit mask into some integer type, which is probably not an unreasonable assumption given that a 64-element set already has 2⁶⁴ subsets.

As the comment says, I find this definition of "lexicographical ordering" a bit odd, since I'd say lexicographical ordering would be: [], [a], [ab], [abc], [ac], [b], [bc], [c]. But I've had to do the "first by size, then lexicographical" enumeration before.

// Generate bitmaps representing all subsets of a set of k elements,
// in order first by (ascending) subset size, and then lexicographically.
// The elements correspond to the bits in increasing magnitude (so the
// first element in lexicographic order corresponds to the 2^0 bit.)
//
// This function generates and returns the next bit-pattern, in circular order
// (so that if the iteration is finished, it returns 0).
//
template<typename UnsignedInteger>
UnsignedInteger next_combination(UnsignedInteger comb, UnsignedInteger mask) {
  UnsignedInteger last_one = comb & -comb;
  UnsignedInteger last_zero = (comb + last_one) &~ comb & mask;
  if (last_zero) return comb + last_one + (last_zero / (last_one * 2)) - 1;
  else if (last_one > 1) return mask / (last_one / 2);
  else return ~comb & 1;
}

The fifth line of the function (the if (last_zero) line) is the bit-hacking equivalent of this (extended) regular expression replacement, which finds the last 01 in the string, flips it to 10, and shifts all the following 1s all the way to the right:

s/01(1*)(0*)$/10\2\1/

The sixth line (the else if branch) applies this one, only if the previous one failed, to add one more 1 and shift all the 1s to the right:

s/(1*)0(0*)/\21\1/

I don't know if that explanation helps or hinders :)
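To make the first replacement concrete, here is a minimal sketch that recomputes the same intermediate values one at a time for a fixed 32-bit type. The Trace struct and trace_same_size_step are hypothetical names introduced only for this illustration; last_one and last_zero mirror the variables in next_combination.

```cpp
#include <cstdint>

// Hypothetical helper type: intermediate values of the first branch.
struct Trace {
  std::uint32_t last_one;   // lowest set bit of comb
  std::uint32_t carried;    // comb + last_one: the "01" has become "10"
  std::uint32_t last_zero;  // the bit that was flipped from 0 to 1 (0 if outside mask)
  std::uint32_t refill;     // the remaining 1s of the run, moved to the bottom
  std::uint32_t next;       // carried + refill
};

// Hypothetical helper: same arithmetic as the if (last_zero) branch,
// split into named steps. refill and next stay 0 when the branch does not apply.
Trace trace_same_size_step(std::uint32_t comb, std::uint32_t mask) {
  Trace t{};
  t.last_one = comb & (0u - comb);
  t.carried = comb + t.last_one;
  t.last_zero = t.carried & ~comb & mask;
  if (t.last_zero) {
    t.refill = t.last_zero / (t.last_one * 2) - 1;
    t.next = t.carried + t.refill;
  }
  return t;
}
```

With comb = 01110 (14) and mask = 11111 (31), the fields come out as last_one = 00010, carried = 10000, refill = 00011 and next = 10011, which is what the regular expression produces from the string 01110.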

Here's a quick and dirty driver (the command-line argument is the size of the set, default 5, maximum the number of bits in an unsigned long):

#include <cstdlib>  // atoi
#include <iostream>

template<typename UnsignedInteger>
std::ostream& show(std::ostream& out, UnsignedInteger comb) {
  out << '[';
  char a = 'a';
  for (UnsignedInteger i = 1; comb; i *= 2, ++a) {
    if (i & comb) {
      out << a;
      comb -= i;
    }
  }
  return out << ']';
}

int main(int argc, char** argv) {
  unsigned int n = 5;
  if (argc > 1) n = atoi(argv[1]);
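  // Build the low-n-bits mask without shifting by the full word width;
  // 1UL << n is undefined when n equals the number of bits in unsigned long.
  unsigned long mask = n ? ((1UL << (n - 1)) - 1) * 2 + 1 : 0;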
  unsigned long comb = 0;
  do {
    show(std::cout, comb) << std::endl;
    comb = next_combination(comb, mask);
  } while (comb);
  return 0;
}

It's hard to believe that this function would be useful for a set of more than 64 elements, given the size of the enumeration, but it might be useful for enumerating some limited part, such as all subsets of three elements. In that case, the bit-hackery only really helps if the modification fits in a single word. Fortunately, that's easy to test: do the computation as above on the last word in the bitset, up to the test for last_zero being zero. (Here you don't need to bitand with mask, and indeed you might want to choose a different way of specifying the set size.)

If last_zero turns out to be zero (which will actually be pretty rare), you need to do the transformation some other way, but the principle is the same: find the last 01 in the string, that is, the 0 that immediately precedes the final run of 1s (watch out for the case where the 0 is at the end of a word and the 1 at the beginning of the next one); change the 01 to 10; figure out how many 1s you need to move; and move them to the end.
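As a rough sketch of that general fallback, assuming the set is stored one element per entry in a std::vector<bool> (index 0 being the lowest-order bit) rather than in packed words, the step could look like this. The function name next_same_size is hypothetical, and this version ignores word boundaries entirely, so it trades speed for simplicity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: advance bits to the next subset of the same size.
// bits[0] is the lowest-order bit (the first element of the set).
// Returns false, leaving bits unchanged, if there is no next subset of that size.
bool next_same_size(std::vector<bool>& bits) {
  const std::size_t n = bits.size();
  std::size_t lo = 0;
  while (lo < n && !bits[lo]) ++lo;       // lowest set bit
  std::size_t hi = lo;
  while (hi < n && bits[hi]) ++hi;        // first zero above the lowest run of 1s
  if (lo == n || hi == n) return false;   // empty set, or the run touches the top
  const std::size_t run = hi - lo;        // number of 1s in that run
  for (std::size_t i = lo; i < hi; ++i) bits[i] = false;
  bits[hi] = true;                        // the "01" becomes "10"
  for (std::size_t i = 0; i + 1 < run; ++i) bits[i] = true;  // other 1s go to the bottom
  return true;
}
```

For example, starting from {a, b} in a five-element set, one call yields {a, c}, and starting from {b, c} it yields {a, d}. A packed multi-word version would follow the same steps but would have to handle a run of 1s that straddles two words.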

Question 2

Considering a list of elements L = [a, b, c], the powerset of L is given by:

P(L) = {
    [],
    [a], [b], [c],
    [a, b], [a, c], [b, c],
    [a, b, c]
}

Considering each position as a bit, you'd have the mappings:

id  | positions - integer | desired set
 0  |  [0 0 0]  -    0    |  []
 1  |  [1 0 0]  -    4    |  [a]
 2  |  [0 1 0]  -    2    |  [b]
 3  |  [0 0 1]  -    1    |  [c]
 4  |  [1 1 0]  -    6    |  [a, b]
 5  |  [1 0 1]  -    5    |  [a, c]
 6  |  [0 1 1]  -    3    |  [b, c]
 7  |  [1 1 1]  -    7    |  [a, b, c]

As you can see, the id is not the integer value of the bit pattern. A proper mapping needs to be applied, so that you have:

id  | positions - integer |  mapped  - integer
 0  |  [0 0 0]  -    0    |  [0 0 0] -    0
 1  |  [1 0 0]  -    4    |  [0 0 1] -    1
 2  |  [0 1 0]  -    2    |  [0 1 0] -    2
 3  |  [0 0 1]  -    1    |  [0 1 1] -    3
 4  |  [1 1 0]  -    6    |  [1 0 0] -    4
 5  |  [1 0 1]  -    5    |  [1 0 1] -    5
 6  |  [0 1 1]  -    3    |  [1 1 0] -    6
 7  |  [1 1 1]  -    7    |  [1 1 1] -    7
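For small sets, the id-to-positions column above can be tabulated by brute force, which is handy as a reference when testing a faster mapping. The sketch below assumes the same bit layout as the table (the first element is the most significant bit) and fewer than 32 elements; powerset_table is a hypothetical name. It relies on the observation that, among subsets of equal size in this layout, the lexicographically smaller subset has the larger integer value.

```cpp
#include <algorithm>
#include <bitset>
#include <vector>

// Hypothetical helper: entry [id] is the integer whose bits mark the elements
// of the id-th subset, ordered by size and then lexicographically.
// Assumes n < 32 and that the first element is the most significant of the n bits.
std::vector<unsigned> powerset_table(unsigned n) {
  std::vector<unsigned> masks(1u << n);
  for (unsigned i = 0; i < masks.size(); ++i) masks[i] = i;
  std::sort(masks.begin(), masks.end(), [](unsigned x, unsigned y) {
    const std::size_t px = std::bitset<32>(x).count();
    const std::size_t py = std::bitset<32>(y).count();
    if (px != py) return px < py;  // smaller subsets first
    return x > y;                  // same size: larger integer is lexicographically earlier
  });
  return masks;
}
```

For n = 3 this gives 0, 4, 2, 1, 6, 5, 3, 7, matching the integer column of the table. It takes time and memory proportional to 2^n, so it is only a checking aid, not a solution for large sets.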

As an attempt at solving this, I came up with a binary tree to do the mapping. I'm posting it in case someone can see a solution from it:

                                        #
                          ______________|_____________
        a               /                             \
                  _____|_____                   _______|______
        b        /           \                 /              \
              __|__         __|__           __|__            __|__
        c    /     \       /     \         /     \          /     \
           [ ]     [c]    [b]   [b, c]    [a]   [a, c]    [a, b]  [a, b, c]
index:      0       3      2       6       1      5         4         7

Question 3

Suppose your set has size N.

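So, there are (N choose k) sets of size k. Counting n from 0, you can find the right k (i.e. the size of the nth set) very quickly: for k = 0, 1, 2, ..., subtract (N choose k) from n, stopping as soon as the next subtraction would make n negative. For example, with N = 3 and n = 5, subtracting 1 and then 3 leaves n = 1, and since (3 choose 2) = 3 > 1, the answer is the 2-subset with index 1. This reduces your problem to finding the nth k-subset of an N-set.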

The first (N-1 choose k-1) k-subsets of your N-set will contain its least element. So, if n is less than (N-1 choose k-1), pick the first element and recurse on the rest of the set. Otherwise, you have one of the (N-1 choose k) other sets; throw away the first element, subtract (N-1 choose k-1) from n, and recurse.

Code:

#include <stdio.h>

int ch[88][88];
int choose(int n, int k) {
 if (n<0||k<0||k>n) return 0;
 if (!k||n==k) return 1;
 if (ch[n][k]) return ch[n][k];
 return ch[n][k] = choose(n-1,k-1) + choose(n-1,k);
}

int nthkset(int N, int n, int k) {
 if (!n) return (1<<k)-1;
 if (choose(N-1,k-1) > n) return 1 | (nthkset(N-1,n,k-1) << 1);
 return nthkset(N-1,n-choose(N-1,k-1),k)<<1;
}

int nthset(int N, int n) {
 for (int k = 0; k <= N; k++)
  if (choose(N,k) > n) return nthkset(N,n,k);
  else n -= choose(N,k);
 return -1; // not enough subsets of [N].
}

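int main() {
 int N,n;
 if (scanf("%i %i", &N, &n) != 2) return 1;
 // The result is an int bitmask, so keep N at most 30 to avoid overflowing it.
 if (N < 0 || N > 30 || n < 0) return 1;
 int a = nthset(N,n);
 if (a < 0) { printf("not enough subsets\n"); return 1; }
 // Print element 0 (the lowest-order bit) first.
 for (int i=0;i<N;i++) printf("%i", !!(a&1<<i));
 printf("\n");
}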
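The same two steps (find k, then pick elements one at a time) can also be written without recursion. The following is only a sketch of that variant, not the code above: binom and nth_subset are hypothetical names, the result is a 64-bit mask with bit 0 standing for the first element, and N is assumed to be at most 32 so that the intermediate products in binom cannot overflow.

```cpp
#include <cstdint>

// Hypothetical helper: C(n, k) by the multiplicative formula.
// Each intermediate value is itself a binomial coefficient, so the division is exact.
std::uint64_t binom(unsigned n, unsigned k) {
  if (k > n) return 0;
  if (k > n - k) k = n - k;
  std::uint64_t r = 1;
  for (unsigned i = 1; i <= k; ++i) r = r * (n - k + i) / i;
  return r;
}

// Hypothetical helper: bitmask of the subset with 0-based index n in
// size-then-lexicographic order, or all ones if n >= 2^N. Assumes N <= 32.
std::uint64_t nth_subset(unsigned N, std::uint64_t n) {
  unsigned k = 0;
  while (k <= N && n >= binom(N, k)) {  // skip whole size classes
    n -= binom(N, k);
    ++k;
  }
  if (k > N) return ~std::uint64_t{0};
  std::uint64_t result = 0;
  for (unsigned pos = 0; pos < N && k > 0; ++pos) {
    // Number of remaining k-subsets that contain element pos.
    const std::uint64_t with = binom(N - pos - 1, k - 1);
    if (n < with) {
      result |= std::uint64_t{1} << pos;  // take element pos
      --k;
    } else {
      n -= with;                          // skip every subset containing pos
    }
  }
  return result;
}
```

For N = 3 the indices 0 through 7 map to the masks 0, 1, 2, 4, 3, 5, 6, 7, that is [], [a], [b], [c], [a, b], [a, c], [b, c], [a, b, c].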