Algorithm for permutations of a string of characters, with no duplicate orders

https://softwareengineering.stackexchange.com/questions/391131

24-02-2021
|

Question

I've seen a lot of good questions about similar problems but not exactly what I'm looking for.

Given: a non-unique set of symbols (or characters)

(1, 1, 2, 3)

Expected: an output similar to this (the order isn't important, only the elements outputed)

I know I could pipe the permutation output into another algorithm similar to the uniq bash command but Im going to be using this on large strings and I want the "uniqueness" done on the fly not after the fact.

I imagine the easiest way to do something like this is to use a hash in the permutation algorithm with the permutations being the key but is there a smarter way to do this? Am I missing a cool trick here?

Edit: For better examples, I've provided two more

# given a set (1,1,1,2,3)
[1, 1, 1, 2, 3]
[1, 1, 1, 3, 2]
[1, 1, 2, 1, 3]
[1, 1, 2, 3, 1]
[1, 1, 3, 1, 2]
[1, 1, 3, 2, 1]
[1, 2, 1, 1, 3]
[1, 2, 1, 3, 1]
[1, 2, 3, 1, 1]
[1, 3, 1, 1, 2]
[1, 3, 1, 2, 1]
[1, 3, 2, 1, 1]
[2, 1, 1, 1, 3]
[2, 1, 1, 3, 1]
[2, 1, 3, 1, 1]
[2, 3, 1, 1, 1]
[3, 1, 1, 1, 2]
[3, 1, 1, 2, 1]
[3, 1, 2, 1, 1]
[3, 2, 1, 1, 1]

# given a set (1, 1, 2, 2, 3)
[1, 1, 2, 2, 3]
[1, 1, 2, 3, 2]
[1, 1, 3, 2, 2]
[1, 2, 1, 2, 3]
[1, 2, 1, 3, 2]
[1, 2, 2, 1, 3]
[1, 2, 2, 3, 1]
[1, 2, 3, 1, 2]
[1, 2, 3, 2, 1]
[1, 3, 1, 2, 2]
[1, 3, 2, 1, 2]
[1, 3, 2, 2, 1]
[2, 1, 1, 2, 3]
[2, 1, 1, 3, 2]
[2, 1, 2, 1, 3]
[2, 1, 2, 3, 1]
[2, 1, 3, 1, 2]
[2, 1, 3, 2, 1]
[2, 2, 1, 1, 3]
[2, 2, 1, 3, 1]
[2, 2, 3, 1, 1]
[2, 3, 1, 1, 2]
[2, 3, 1, 2, 1]
[2, 3, 2, 1, 1]
[3, 1, 1, 2, 2]
[3, 1, 2, 1, 2]
[3, 1, 2, 2, 1]
[3, 2, 1, 1, 2]
[3, 2, 1, 2, 1]
[3, 2, 2, 1, 1]

Solution

One possible way could be to first convert your string in a map of character to integers.

Then you would iterate over the keys of the map to select the next possible character.

#include <iostream>
#include <map>

std::map<char, size_t>  getCharacterMap(const std::string &s)
  {
  std::map<char, size_t> charactersCount;
  for(std::string::size_type i = 0; i < s.size(); ++i)
    ++charactersCount[s[i]];
  return charactersCount;
  }


void recursivePermutation(const std::string &partialPermutation, const std::map<char, size_t> &charCount)
  {
  if (charCount.empty())
    {
    std::cout << partialPermutation << std::endl;
    return;
    }
  for (std::map<char, size_t>::const_iterator it = charCount.begin(); it != charCount.end(); ++it)
    {
    char key = it->first;
    size_t count = it->second;
    std::string longerPartialPermutation = partialPermutation;
    longerPartialPermutation += key;
    std::map<char, size_t> reducedCount = charCount;
    if (count == 1)
      reducedCount.erase(reducedCount.find(key));
    else
       reducedCount[key]-=1;
    recursivePermutation(longerPartialPermutation, reducedCount);
    }

  }

void permutation(const std::string &s)
  {
  std::map<char, size_t> charCount = getCharacterMap(s);
  recursivePermutation("", charCount);
  }

int main()
  {
  permutation("1122");
  return 0;
  }

OTHER TIPS

This is exactly the example use of the C++ standard library function std::next_permutation, starting from sorted data.

#include <algorithm>
#include <string>
#include <iostream>

int main()
{
    std::string s = "aba";
    std::sort(s.begin(), s.end());
    do {
        std::cout << s << '\n';
    } while(std::next_permutation(s.begin(), s.end()));
}

Has the output

aab
aba
baa

It doesn't include the duplicate cases with "swapped" as.

Adapting the possible implementation from the linked reference to another language shouldn't be too difficult, e.g. C#:

class PermutationExtensions {
    private static void Swap<T>(IList<T> list, int indexA, int indexB)
    {
        T tmp = list[indexA];
        list[indexA] = list[indexB];
        list[indexB] = tmp;
    }

    private static bool NextPermutation(IList<T> list)
    {
        if (list.Count() < 2) return false;
        int i = list.Count() - 1;

        while (true) {
            int i1 = i;
            if (list[--i] < list[i1]) {
                int i2 = list.Count();
                while (!(list[i] < list[--i2]))
                    ;
                Swap(list, i, i2);
                list.Reverse(i1, list.Count() - i1);
                return true;
            }
            if (i == 0) {
                list.Reverse();
                return false;
            }
        }
    }

    public static IEnumerable<IList<T>> Permutations(this IEnumerable<T> items)
    {
        IList<T> local = items.ToList();
        local.Sort();
        yield local.ToList(); // clone the list as we modify it in NextPermutation
        while (NextPermutation(local))
        {
            yield local.ToList();
        }
    }
}

This doesn't require any substantial data structure, if you don't want to use one.

You can use nested loops to generate matches (i.e. triple for three matches, quadruple for for).

Skip any matches where an index is being reused, e.g. skip when an outer loop's index is the same as an inner loops index. That ensures the basic permutations.

The next requirement is to skip the duplicates caused by the duplicate elements in the input. This can be done lexically (or numerically). Track the last permutation accepted, and skip any new ones that have a lessor (or same) value.

Here's my sample in VBA/Excel:

Sub doit()
    Dim r As Range
    Set r = ActiveSheet.Cells
    n = 4
    s = "1123"
    Dim row As Integer
    Dim last As Integer
    Let last = 0
    Let row = 1
    For i = 1 To n
        ic = Mid(s, i, 1)
        For j = 1 To n
            If j <> i Then  'skip this index if it is in use by i
                jc = Mid(s, j, 1)
                For k = 1 To n
                    If k <> i And k <> j Then   'skip this index if it is in use by i or j
                        kc = Mid(s, k, 1)
                        For m = 1 To n
                            If m <> i And m <> j And m <> k Then
                                mc = Mid(s, m, 1)
                                v = ic & jc & kc & mc 'v is our candidate permutation
                                Dim x As Integer
                                Let x = Int(v)  'use value of string for greater than
                                If x > last Then
                                    Let r.Cells(row, 1).Value = v
                                    Let last = x
                                    Let row = row + 1 'move output to different row of worksheet
                                End If
                            End If
                        Next
                    End If
                Next
            End If
        Next
    Next
End Sub

(This version requires that the string you provide has its characters stored in increasing order.)

Another option to explore, maybe, is to use generative grammars, eg, for your (1, 1, 2, 2, 3) last example:

Word -> One One Two Two Three
      | One One Two Three Two
      | One One Three Two Two
      | ... (etc)

(where the alphabet also drives how many One's vs Two's vs Three's are allowed to appear on the right hand side)

One -> 1

Two -> 2

Three -> 3

Word being the grammar axiom, with the only non terminal productions - (where pipe "|" is the usual BNF symbol for alternatives) - and One, Two, Three being the only terminal productions of the respective 3 distinct classes of terminal symbols - 1, 2, 3.

Then the algorithm consists in,

1) preparing the One, Two, and Three set of productions (based on your alphabet)

2) preparing the Word set of production alternatives - ie, that "Word -> ... | ..." rule - based on known word length and on the permutations over the terminal ones that may appear on the right hand side (One, Two, Three), prior obtained in (1)

3) simply enumerating the Word productions obtained in (2) and applying them in sequence.

Your uniqueness requirement has thus basically been "reduced", so to speak, to that of the uniqueness of what (2), itself, can generate.

'Hth

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange