Question

Some background: I'm writing a more or less brute force search algorithm for solving a problem that I have. In order to do this, I need to generate and evaluate all possibilities to find out which is best. Since the evaluation actually takes some time I would prefer to generate as little as possible solutions that completely cover my search space. Furthermore, the more elements I can do this for, the better. For any number K there are normally K! permutations, and generating them all will be hard for numbers higher than ~10.

Real problem: The search space should contain all permutations of two elements (N times el1 and M times el2, where K=M+N), with these restrictions:

  1. they have to be unique (i.e. I only want [a a b b b] once)
  2. I don't need the reverse of any permutation (i.e. if I have [a a b], I don't also need [b a a])
  3. I consider the permutations to be circular, so [a a b] = [a b a] = [b a a]

If I would be able to do this, the number of possibilities would be decreased drastically. Since K will ideally be large, it is not feasible to first generate all permutations and then filter them according to these criteria. I have already done the first restriction (see below) and it cut back the number from 2^K for Matlab's normal permutations function (perms) to K!/N!M!, which is a huge win. The second restriction will only cut the number of possiblities in half (in the best case), but I think the third should also be able to really cut down the number of possibilities.

If anyone knows how to do it, and preferably also how to calculate how many possibilities there will be, that would help me a lot! I would prefer an explanation, but code is also fine (I can read C-like languages, Java(Script), Python, Ruby, Lisp/Scheme).


For the interested: Here is the algorithm for getting only unique permutations that I have so far:

function genPossibilities(n, m, e1, e2)
     if n == 0
         return array of m e2's
     else
         possibilities = genPossibilities(n-1, m, e1, e2)
         for every possibility:
             gain = number of new possibilities we'll get for this smaller possibility*
             for i in max(0,(m+n-gain))
                 if possibility(i) is not e1
                     add possiblity with e1 inserted in position i
         return new possibilities
  • If you have all permutations for N-1 and M, then you can use them to find the permutations for N and M by inserting e1 into them. You can't just insert everywhere though, because then you'll get duplicates. I don't know why this works, but you can calculate the number of new possibilities that you'll generate from an old one (I call this 'gain'). This number starts at M+1 for the first old permutation and decreases by one for each old permutation until it would become zero, at which point it goes back to M, etc. (only works if M>=N). So if you want to calculate the permutations for N=3 and M=3 and you have the 10 permutations for N=2 and M=3, their gains will be [4 3 2 1 3 2 1 2 1 1]. Subtract this gain from the length of the permutation and you get the index at which you can start inserting new elements without making duplicates.
Was it helpful?

Solution

What you are after is a subset of 2-ary bracelets (the subset is defined by exactly n of character A and m of character B). The set of all bracelets allows for the number of A's and B's to vary.

The following code prints out the sequences you are after, and does so in lexical order and in constant amortised time. It is based on the general algorithm in this paper by Sawada - for an explanation of how it works, see that paper.

#include <stdlib.h>
#include <stdio.h>

static int *a;
static int n;

void print_bracelet(int n, int a[])
{
    int i;

    printf("[");
    for (i = 0; i < n; i++)
        printf(" %c", 'a' + a[i]);
    printf(" ]\n");
}

int check_rev(int t, int i)
{
    int j;

    for (j = i+1; j <= (t + 1)/2; j++)
    {
        if (a[j] < a[t-j+1])
            return 0;
        if (a[j] > a[t-j+1])
            return -1;
    }

    return 1;
}

void gen_bracelets(int n_a, int n_b, int t, int p, int r, int u, int v, int rs)
{
    if (2 * (t - 1) > (n + r))
    {
        if (a[t-1] > a[n-t+2+r])
            rs = 0;
        else if (a[t-1] < a[n-t+2+r])
            rs = 1;
    }
    if (t > n)
    {
        if (!rs && (n % p) == 0)
            print_bracelet(n, a + 1);
    }
    else
    {
        int n_a2 = n_a;
        int n_b2 = n_b;

        a[t] = a[t-p];

        if (a[t] == 0)
            n_a2--;
        else
            n_b2--;

        if (a[t] == a[1])
            v++;
        else
            v = 0;

        if ((u == (t - 1)) && (a[t-1] == a[1]))
            u++;

        if ((n_a2 >= 0) && (n_b2 >= 0) && !((t == n) && (u != n) && (a[n] == a[1])))
        {
            if (u == v) {
                int rev = check_rev(t, u);

                if (rev == 0)
                    gen_bracelets(n_a2, n_b2, t + 1, p, r, u, v, rs);

                if (rev == 1)
                    gen_bracelets(n_a2, n_b2, t + 1, p, t, u, v, 0);
            }
            else
                gen_bracelets(n_a2, n_b2, t + 1, p, r, u, v, rs);
        }

        if (u == t)
            u--;

        if (a[t-p] == 0 && n_b > 0)
        {
            a[t] = 1;

            if (t == 1)
                gen_bracelets(n_a, n_b - 1, t + 1, t, 1, 1, 1, rs);
            else
                gen_bracelets(n_a, n_b - 1, t + 1, t, r, u, 0, rs);
        }
    }
}

int main(int argc, char *argv[])
{
    int n_a, n_b;

    if (argc < 3)
    {
        fprintf(stderr, "Usage: %s <a> <b>\n", argv[0]);
        return -2;
    }

    n_a = atoi(argv[1]);
    n_b = atoi(argv[2]);

    if (n_a < 0 || n_b < 0)
    {
        fprintf(stderr, "a and b must be nonnegative\n");
        return -3;
    }

    n = n_a + n_b;
    a = malloc((n + 1) * sizeof(int));

    if (!a)
    {
        fprintf(stderr, "could not allocate array\n");
        return -1;
    }

    a[0] = 0;

    gen_bracelets(n_a, n_b, 1, 1, 0, 0, 0, 0);

    free(a);
    return 0;
}

OTHER TIPS

I think you want to generate 2-ary free necklaces. See this question for link, papers, and some code.

You are looking for combinations - which are order independent. Matlab calculated this correctly with K!/N!M! which is precisely the formula for calculating the number of combinations.

Assuming you have an array of all permutations, you can put the contents of the array into a hash. Then this will work (a little brute force, but its a start):

for each (element in array of permutations){
  if (element exists in hash){
    remove each circular permutation of element in hash except for element itself
  }
}

If you have only two elements, your space is much smaller: 2^k rather than k!.

Try an approach like this:

  1. Run through all numbers from 1 through 2^k.
  2. Write the number in binary form.
  3. Translate all 0s to a and 1s to b. Now you have a permutation.
  4. Take your sequence and generate 2k sequences by cyclic permutations and reversal. You only need to evaluate 1 of these 2k sequences.
  5. Among the 2k sequences, choose the one that sorts first alphabetically.
  6. Check your log to see if you've already done this one. If so, skip.
  7. If this one is new, evaluate it, and add to the "done" log. (If space permits, you could add all 2k elements of the "family" to the done log, so you could move step (6) to right after step (3). You could also store the number, rather than the sequence of a's and b's, in the "done" log.)

If you have j possible symbols, rather than just two, do the same thing but use base j rather than base 2.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top