Question

UPDATE: @bphelpsjr answer provides what I am looking for. Unfortunately someone down-voted him and I do not have the rep to up-vote. I am marking his response as the answer.

This is extremely long winded but I wanted to provide as much detail as possible.

Essentially, I want to take a set of data and generate a list of lists based on rules (defined below). This is essentially a filtered version of a powerset.

I will then store these results for repeated use (similar to that of a rainbow table) to avoid constant calculation of the same N. I will then use variable substitution (e.g., A = 18, B = 30) before applying other logic (not described below, not necessary for my question).

Here are two input options I've experimented with in attempting to create a solution. You could also use numbers instead of letters.

Input Option #1

var completeList = new List<Item>
        {
            new Item('A', 'A'),
            new Item('A', 'B'),
            new Item('A', 'C'),
            new Item('A', 'D'),            
            new Item('B', 'B'),
            new Item('B', 'C'),
            new Item('B', 'D'),           
            new Item('C', 'C'),
            new Item('C', 'D'),            
            new Item('D', 'D')
        };

Input Option #2

List<Item> aList = new List<Item> 
{
        new Item('A', 'A'),
        new Item('A', 'B'),
        new Item('A', 'C'),
        new Item('A', 'D'),            
    };

    List<Item> bList = new List<Item> 
    {
        new Item('B', 'B'),
        new Item('B', 'C'),
        new Item('B', 'D'),           
    };

    List<Item> cList = new List<Item> 
    {
        new Item('C', 'C'),
        new Item('C', 'D'),            
    };

    List<Item> dList = new List<Item> 
    {
        new Item('D', 'D')
    };

Desired Output

AA BB CC DD
AA BB CD
AA BC    DD
AA BD CC
AB    CC DD 
AB    CD
AC BB    DD
AC BD
AD BB CC
AD BC

Rules

The first 3 are definitive rules while the 4th is more a desire.

  1. Solution must be able to handle N number of distinct letters and lists of items

  2. Every distinct letter must appear at least once in the list of items. Example:

    AA BB CC DD <-- Valid

    AA BB CC <-- invalid, does not contain D

  3. Letters may only repeat within a given item. Example:

    AA BB CC DD <-- valid

    AA BA CC DD <-- invalid, A is repeated in a different item

  4. The logic must contain as much "aggressive filtering" and short circuiting as possible in order to cut down on the number of iterations that it will perform. I had a working left-shift solution but it does not scale whatsoever due to the (my?) inability to incorporate the filtering and short circuiting. This basically resulted in iterating through the entire powerset.

    • Example: Once a letter is found that is already contained within a potential list's items, move on to the next potential combination because this one is invalid.

    • Example: Once a valid list of items has been found, start the next round.

The next two are potential examples based on the way I currently have the data set grouped by the first letter of each item. They may not be applicable depending on what type of solution you're creating.

  • Potential Example: If an item contains a letter that is in the next list's items, skip that entire list and move to the next list of items.

    AA BC DD <-- We can skip the C list because it is covered by BC

  • Potential Example: Once you have exhausted a list's potential candidates (e.g., the last list will only ever have 1 item), you shouldn't (if my thinking is correct) need that list again until the list above it + 1 has changed items.

    AA BB CC DD <-- after you find this, stop searching the list containing DD until you get to BC (list above DD + 1)

    AA BB CD

    AA BC DD <-- We need DD again

    1. No list of items should repeat itself, regardless of the order of items. Example:

    AA BB CC DD == BB AA DD CC so do not include BB AA DD CC

Assumptions I've made:

  • It will be easier to group the sets by their initial starting letter (see sample data below). If this is not the optimal approach, it is not an issue.

Below is the Item class I use to hold my data simply for convenience:

public class Item
{
    public char First { get; set; }
    public char Second { get; set; }

    public Item(char first, char second)
    {
        First = first;
        Second = second;

    public List<char> DistinctCharacters()
    {
        return First == Second ? new List<char> { First } : new List<char> { First,  Second };
    }
}
Was it helpful?

Solution

This should work (using numbers instead of letters):

    private static BlockingCollection<List<int[]>> GetCombinations(int toConsider)
    {
        var allResults = new BlockingCollection<List<int[]>>();
        var possibilities = Enumerable.Range(0, toConsider).ToList();
        Parallel.ForEach(possibilities, possibility =>
        {
            GetIteratively(new List<int[]> { new[] { 0, possibility } }, allResults, possibilities.RemoveAllClone(x => x == 0 || x == possibility));
        });
        return allResults;
    }
    public static void GetIteratively(List<int[]> result, BlockingCollection<List<int[]>> allResults, List<int> possibilities)
    {
        Stack<Tuple<List<int[]>, List<int>>> stack = new Stack<Tuple<List<int[]>, List<int>>>();
        stack.Push(new Tuple<List<int[]>,List<int>>(result, possibilities));
        while (stack.Count > 0)
        {
            var pop = stack.Pop();
            if (pop.Item2.Count > 0)
                pop.Item2.ForEach(x => stack.Push(new Tuple<List<int[]>, List<int>>(new List<int[]>(result) { new int[] { pop.Item2[0], x } }, pop.Item2.RemoveAllClone(y => (y == pop.Item2[0] || y == x)))));
            else
                allResults.Add(result);
        }   
    }

And here is the LinqExtension for RemoveAllClone

    public static List<T> RemoveAllClone<T>(this IEnumerable<T> source, Predicate<T> match)
    {
        var clone = new List<T>(source);
        clone.RemoveAll(match);
        return clone;
    }

OTHER TIPS

I do not have enough rep to comment so I am posting an incomplete answer. I have a solution but havent refined it. It currently spits out incomplete combinations (eg AD CC) and could use some pruning to avoid looking at useless lists.

My approach is recursive, but avoids some computations by storing solutions. For example, the combinations remaining when looking at the C list, having used the A and B letters are the same whether the combination so far is AA BB or AB.

I have not implemented the Memorize() and IKnowThis() methods but they should be straightforward using hashtables.

foreach (var combo in GenerateCombinations("", 0))   
{
    Console.WriteLine(combo);
}

private static List<string> GenerateCombinations(string used, int listIndex)
    {
        if (listIndex >= _allLists.Count || used.Length == _allLists.Count)
            return new List<string>();

        List<string> combos;

        if (!IKnowThis(used, listIndex, out combos))
        {
            if (used.Contains(_allLists[listIndex][0].First))
                return GenerateCombinations(used, listIndex + 1);

            combos = new List<string>();

            foreach (var item in _allLists[listIndex])
            {
                var newcombos = new List<string>();



                string newUsed = Combine(used, item);
                newcombos.AddRange(GenerateCombinations(newUsed, listIndex + 1));

                if (!used.Contains(item.Second) && !used.Contains(item.First))
                {
                    if (newcombos.Count == 0)
                    {
                        newcombos.Add(item.ToString());
                    }
                    else
                    {
                        for (int i = 0; i < newcombos.Count; i++)
                        {
                            newcombos[i] = item + " " + newcombos[i];
                        }
                    }
                }

                combos.AddRange(newcombos);
            }
        }

        Memorize(used, combos);
        return combos;
    }

    private static string Combine(string used, Item item)
    {
        if (!used.Contains(item.First))
            used += item.First;
        if (!used.Contains(item.Second))
            used += item.Second;

        return used;
    }        
}

public class Item
{
    public char First { get; set; }
    public char Second { get; set; }

    public Item(char first, char second)
    {
        First = first;
        Second = second;
    }
    public string DistinctCharacters()
    {
        return First == Second ? First.ToString() : this.ToString();
    }

    public override string ToString()
    {
        return First.ToString() + Second;
    }
}

Does this work to give you what you want?

If I start with your completeList plus the missing backwards transitions:

var completeList = new List<Item>
{
    new Item('A', 'A'),
    new Item('A', 'B'),
    new Item('A', 'C'),
    new Item('A', 'D'),
    new Item('B', 'B'),
    new Item('B', 'C'),
    new Item('B', 'D'),
    new Item('C', 'B'),
    new Item('C', 'C'),
    new Item('C', 'D'),
    new Item('D', 'B'),
    new Item('D', 'C'),
    new Item('D', 'D'),
};

Then I can do this:

var lookup = completeList.ToLookup(x => x.First, x => x.Second);

Func<IEnumerable<string>, IEnumerable<string>> f = null;
f = xs =>
{
    var query =
        from x in xs
        let ys = lookup[x.Last()]
            .Where(y => !x
                .Take(x.Length % 2 == 1 ? x.Length - 1 : x.Length)
                .Contains(y))
            .Select(y => x + y)
            .ToArray()
        group new { x, ys } by ys.Any();

    return query
        .Where(c => c.Key == false)
        .SelectMany(qs => qs.Select(q => q.x))
        .Concat(query
            .Where(c => c.Key == true)
            .SelectMany(ys => Generate(ys.SelectMany(y => y.ys))));
};

var results = f(new [] { "A" });

I get these results:

ABCD 
ABDC 
ACBD 
ACDB 
ADBC 
ADCB 
AABBCD 
AABBDC 
AABCDD 
AABDCC 
AACBDD 
AACCBD 
AACCDB 
AACDBB 
AADBCC 
AADCBB 
AADDBC 
AADDCB 
ABCCDD 
ABDDCC 
ACBBDD 
ACDDBB 
ADBBCC 
ADCCBB 
AABBCCDD 
AABBDDCC 
AACCBBDD 
AACCDDBB 
AADDBBCC 
AADDCCBB 
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top