Crafting a LINQ based solution to determine if a set of predicates are satisfied for a pair of collections constrained by a set of invariants

StackOverflow https://stackoverflow.com/questions/16703082

Question

This isn't a question I feel I have the vocabulary to express properly, but I have two collections of the same anonymous type (let's call it 'a).

'a is defined as new {string Name, int Count}

We shall call one of these collections of 'a requirements and the other candidates.

Given these collections, I want to determine if the following assertions hold.

  1. If there exists some element in requirements r such that r.Count == 0, each element in candidates c such that r.Name == c.Name must satisfy c.Count == 0. There must exist one such element in candidates for each such element in requirements.

  2. For each element of requirements r where r.Count > 0, there must be some subset of elements in candidates c₁, ..., cₓ such that c₁.Name, c₂.Name, ..., cₓ.Name == r.Name and c₁.Count + ... + cₓ.Count >= r.Count. Each element of candidates used to satisfy this rule for some element in requirements may not be used for another element in requirements.

For example, given

requirements = {{"A", 0}, {"B", 0}, {"C", 9}}
candidates = {{"B", 0}, {"C", 1}, {"A", 0}, {"D", 2}, {"C", 4}, {"C", 4}}

the query would be satisfied.

r={"A", 0} and r={"B", 0} would be satisfied according to rule #1 against c={"A", 0} and c={"B", 0}

-and-

r={"C", 9} is satisfied according to rule #2 by the group gc, formed by grouping candidates on c.Name, derived from {{"C", 1}, {"C", 4}, {"C", 4}} as gc = {"C", 9}

However it is worth noting that if requirements contained {"C", 6} and {"C", 3} instead of {"C", 9}, this particular set of collections would fail to satisfy the predicates.

Now, finally, to the question: what is the best way to form this into a LINQ expression, prioritizing speed (fewest iterations)?

The unsolved subset has been re-asked here

Solution 5

I finally came up with a workable solution:

        IEnumerable<Glyph> requirements = t.Objectives.Cast<Glyph>().OrderBy(k => k.Name);
        IEnumerable<Glyph> candidates = Resources.Cast<Glyph>().OrderBy(k => k.Name);

        IEnumerable<Glyph> zeroCountCandidates = candidates.Where(c => c.Count == 0);
        IEnumerable<Glyph> zeroCountRequirements = requirements.Where(r => r.Count == 0);

        List<Glyph> remainingCandidates = zeroCountCandidates.ToList();

        if (zeroCountCandidates.Count() < zeroCountRequirements.Count())
        {
            return false;
        }

        foreach (var r in zeroCountRequirements)
        {
            if (!remainingCandidates.Contains(r))
            {
                return false;
            }
            else
            {
                remainingCandidates.Remove(r);
            }
        }

        IEnumerable<Glyph> nonZeroCountCandidates = candidates.Where(c => c.Count > 0);
        IEnumerable<Glyph> nonZeroCountRequirements = requirements.Where(r => r.Count > 0);

        var perms = nonZeroCountCandidates.Permutations();

        foreach (var perm in perms)
        {
            bool isViableSolution = true;

            remainingCandidates = perm.ToList();

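            foreach (var requirement in nonZeroCountRequirements)
            {
                int countThreshold = requirement.Count;
                while (countThreshold > 0)
                {
                    // Only a candidate with the same Name may count toward this requirement.
                    int index = remainingCandidates.FindIndex(c => c.Name == requirement.Name);
                    if (index < 0)
                    {
                        isViableSolution = false;
                        break;
                    }

                    countThreshold -= remainingCandidates[index].Count;

                    // Each candidate may be consumed by at most one requirement.
                    remainingCandidates.RemoveAt(index);
                }
            }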

            if (isViableSolution)
            {
                return true;
            }
        }

        return false;

Disgusting, isn't it?
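The code above calls a Permutations() extension method that is not part of LINQ and is not shown in the answer. The following is a minimal sketch of what such a helper might look like; the class name PermutationExtensions and the exact signature are assumptions, not the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PermutationExtensions
{
    // Hypothetical helper: lazily yields every ordering of the source sequence.
    // There are n! orderings for n items, so this is only practical for small inputs.
    public static IEnumerable<IList<T>> Permutations<T>(this IEnumerable<T> source)
    {
        var items = source.ToList();
        return Permute(items, 0);
    }

    private static IEnumerable<IList<T>> Permute<T>(List<T> items, int start)
    {
        if (start >= items.Count - 1)
        {
            // Yield a copy so callers can keep or mutate the result safely.
            yield return items.ToList();
            yield break;
        }

        for (int i = start; i < items.Count; i++)
        {
            Swap(items, start, i);
            foreach (var permutation in Permute(items, start + 1))
            {
                yield return permutation;
            }
            Swap(items, start, i); // restore the order before the next choice
        }
    }

    private static void Swap<T>(List<T> items, int a, int b)
    {
        T temp = items[a];
        items[a] = items[b];
        items[b] = temp;
    }
}
```

Because the number of orderings grows factorially, the brute-force search is likely to be usable only when there are a handful of non-zero candidates.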

OTHER TIPS

Here's my sketch for a linqy solution, but it doesn't address #3 at all. It works by grouping and joining on names. The hard part would then be to determine if there is some matching of requirements to candidates that satisfies the group.

void Main() {
    var requirements = new [] {
        new NameCount{ Name = "A", Count = 0 },
        new NameCount{ Name = "B", Count = 0 },
        new NameCount{ Name = "C", Count = 9 },
        new NameCount{ Name = "D", Count = 3 },
        new NameCount{ Name = "D", Count = 5 },
    };

    var candidates = new[] {
        new NameCount {Name = "B", Count = 0},
        new NameCount {Name = "C", Count = 1},
        new NameCount {Name = "A", Count = 0}, 
        new NameCount {Name = "D", Count = 2},
        new NameCount {Name = "C", Count = 4},
        new NameCount {Name = "C", Count = 4}
    };

    var matched = requirements
        .GroupBy(r => r.Name)
        .GroupJoin(candidates, rg => rg.Key, c => c.Name, 
            (rg, cg) => new { requirements = rg, candidates = cg });

    bool satisfied = matched.All( /* ??? */ );
}

struct NameCount {
    public string Name;
    public int Count;
}

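For the given input, matched would contain one entry per requirement name:

A: requirements = {A, 0}; candidates = {A, 0}
B: requirements = {B, 0}; candidates = {B, 0}
C: requirements = {C, 9}; candidates = {C, 1}, {C, 4}, {C, 4}
D: requirements = {D, 3}, {D, 5}; candidates = {D, 2}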

.GroupJoin has much better performance characteristics than candidates.Where in the projection.

After reconsidering the revised requirements, I've come up with invariant assertions that must hold for a solution to exist.

For each paired cg and rg...

|cg.Name| >= |rg.Name|
cg.SummedCount >= rg.SummedCount

Assuming we have satisfied those conditions, a solution MAY exist.
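The two conditions above can be sketched as a predicate over the grouped pairs. This assumes the NameCount struct from the earlier sketch; the names NecessaryConditions and MayBeSatisfiable are made up for illustration. A true result only means a solution may exist, not that one does.

```csharp
using System.Collections.Generic;
using System.Linq;

public struct NameCount
{
    public string Name;
    public int Count;
}

public static class NecessaryConditions
{
    // Returns false when a solution is impossible; true only means one MAY exist.
    public static bool MayBeSatisfiable(
        IEnumerable<NameCount> requirements,
        IEnumerable<NameCount> candidates)
    {
        return requirements
            .GroupBy(r => r.Name)
            .GroupJoin(candidates, rg => rg.Key, c => c.Name,
                (rg, cg) => new { Requirements = rg.ToList(), Candidates = cg.ToList() })
            .All(pair =>
                // |cg| >= |rg|: each requirement needs at least one candidate of its own.
                pair.Candidates.Count >= pair.Requirements.Count &&
                // The candidates' total must cover the requirements' total.
                pair.Candidates.Sum(c => c.Count) >= pair.Requirements.Sum(r => r.Count));
    }
}
```

For the sample input in the sketch above, the "D" group has two requirements but only one candidate, so this check already rules it out.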

My intuition suggests something similar to the following algorithm:

For each Name...

Let us call each r in rg a basket, and each c in cg an apple.

Sort apples in descending order.

We will keep track of which elements we've assigned to each basket in rg (e.g. r₁ is paired with cg₁). Keep the baskets sorted in ascending order of rₓ.Count - cgₓ.Count. (This value may be negative.)

Now, iterate through our list of apples, starting with the largest, and assign each one to the least filled basket by iterating through rg. If it would overfill the first basket, we continue descending through the list until we encounter a basket that would remain unfilled if we put that apple in it. We then choose the previous basket.

That is, we want to minimize the number of apples necessary to fill each basket, so we prefer a perfect fit to overfilling, and overfilling to underfilling.

This algorithm does not work on the following case:
rg = (6, 5), cg = (3, 2, 2, 2, 2)

The above algorithm produces
r6 = (3, 2, 2), r5 = (2, 2)
whereas the solution ought to be
r6 = (2, 2, 2), r5 = (3, 2)
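Since the greedy rule misses this case, one fallback is an exhaustive backtracking search per Name. The sketch below is not the algorithm described above; it swaps in plain backtracking, and BasketFilling and CanFill are hypothetical names. It is exponential in the worst case, so it is probably only suitable for small groups.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BasketFilling
{
    // Hypothetical helper for a single Name: can the apples (candidate counts) be
    // split into disjoint groups so that every basket (requirement count) is reached?
    public static bool CanFill(IEnumerable<int> baskets, IEnumerable<int> apples)
    {
        var remaining = baskets.ToArray();
        var sorted = apples.OrderByDescending(a => a).ToArray();
        return Assign(remaining, sorted, 0);
    }

    private static bool Assign(int[] remaining, int[] apples, int next)
    {
        if (remaining.All(r => r <= 0))
        {
            return true; // every basket is full
        }
        if (next == apples.Length)
        {
            return false; // out of apples with a basket still unfilled
        }

        for (int i = 0; i < remaining.Length; i++)
        {
            if (remaining[i] <= 0)
            {
                continue; // never add to a basket that is already full
            }
            remaining[i] -= apples[next];
            bool solved = Assign(remaining, apples, next + 1);
            remaining[i] += apples[next]; // undo before trying the next basket
            if (solved)
            {
                return true;
            }
        }

        // Also consider leaving this apple unused.
        return Assign(remaining, apples, next + 1);
    }
}
```

Called as BasketFilling.CanFill(new[] {6, 5}, new[] {3, 2, 2, 2, 2}), this finds the split that the greedy rule misses.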

Going to post the obvious answer here, but I'm looking for something more elegant.

Given candidates as IEnumerable<'a>, project groupedCandidates (also IEnumerable<'a>) from it: call candidates.Where(c=>c.Count != 0).GroupBy(...) and Sum the Count of all elements that share a name.

Then project simpleCandidates from candidates.Except(groupedCandidates, (c,gc)=>c.Name == gc.Name). (Enumerable.Except takes an IEqualityComparer rather than a lambda, so read this as shorthand for a name-based comparison.)

Past here it gets fuzzy because candidates may only satisfy a requirement once.

EDIT: This solution does not meet the revised requirements.


I'm not familiar with LINQ, but it looks like you can do this problem in O(n) unless I misunderstand something. There are three steps to completing this problem.

First, construct a list or hashtable counter and populate it by iterating through c. If we use a hashtable, the size of the hashtable will be the length of c so we don't have to resize our hashtable.

for candidate in c:
    counter[candidate.name] += candidate.count

We do this in one pass. O(m) where m is the length of c.

With counter constructed, we construct a hashtable by iterating through r.

for requirement in r:
    if not h[requirement.name] or not requirement.count >= h[requirement.name]:
        h[requirement.name] = requirement.count

Then, we simply iterate through counter and compare counts.

for sum in counter:
    assert h[sum.name] and h[sum.name] >= sum.count

We do this in one pass: O(p) where p is the length of counter.

If this algorithm terminates successfully, our constraints are satisfied, and we've completed it in O(m) + O(n) + O(p), where n is the length of r.
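The three passes might look roughly like this in C#. This is a variation on the pseudocode rather than a transcription of it: it sums the requirements per name as well, and TotalsCheck and TotalsCover are hypothetical names. Each element is passed as a KeyValuePair of name and count purely to keep the sketch self-contained.

```csharp
using System.Collections.Generic;

public static class TotalsCheck
{
    public static bool TotalsCover(
        IEnumerable<KeyValuePair<string, int>> requirements,
        IEnumerable<KeyValuePair<string, int>> candidates)
    {
        // Pass 1, O(m): total candidate count per name.
        var counter = new Dictionary<string, int>();
        foreach (var candidate in candidates)
        {
            int total;
            counter.TryGetValue(candidate.Key, out total); // total stays 0 if absent
            counter[candidate.Key] = total + candidate.Value;
        }

        // Pass 2, O(n): total required count per name.
        var required = new Dictionary<string, int>();
        foreach (var requirement in requirements)
        {
            int total;
            required.TryGetValue(requirement.Key, out total);
            required[requirement.Key] = total + requirement.Value;
        }

        // Pass 3: every required name must exist among the candidates and be covered.
        foreach (var pair in required)
        {
            int available;
            if (!counter.TryGetValue(pair.Key, out available) || available < pair.Value)
            {
                return false;
            }
        }
        return true;
    }
}
```

Note that comparing totals alone cannot enforce the rule that a candidate serves only one requirement: requirements {"C", 6} and {"C", 3} against candidates {"C", 1}, {"C", 4}, {"C", 4} pass this check even though the question says they should fail.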

algorithm:

if any requirement Name doesn't exist in the candidates, return false
for any requirement having Count = 0
    if there aren't at least as many candidates 
       with the same Name and Count, return false
eliminate all exact matches between candidates and requirements
eliminate requirements (and candidates) where the requirement 
    and all higher requirements have a higher candidate available
for remaining non-zero requirements
    find the subset of candidates
    that matches the most requirements
       and eliminate the requirements (and candidates)
if there are any remaining non-zero requirements
    return false
return true because no unmatched requirements remain

sample implementation:

public static bool IsValid(IEnumerable<string> requirementNames,
                           IList<int> requirementCounts,
                           IEnumerable<string> candidateNames,
                           IList<int> candidateCounts)
{
    var requirements = requirementNames
        .Select((x, i) => new
          {
              Name = x,
              Count = requirementCounts[i]
          })
        .ToList();
    var candidates = candidateNames
        .Select((x, i) => new
          {
              Name = x,
              Count = candidateCounts[i]
          })
        .ToList();

    var zeroRequirements = requirements
        .Where(x => x.Count == 0)
        .Select(x => x.Name)
        .GroupBy(x => x)
        .ToDictionary(x => x.Key, x => x.Count());
    var zeroCandidates = candidates
        .Where(x => x.Count == 0)
        .Select(x => x.Name)
        .GroupBy(x => x)
        .ToDictionary(x => x.Key, x => x.Count());
    if (zeroRequirements.Keys.Any(x => !zeroCandidates.ContainsKey(x) ||
                                       zeroCandidates[x] < zeroRequirements[x]))
    {
        return false;
    }

    var nonZeroRequirements = requirements
        .Where(x => x.Count != 0)
        .GroupBy(x => x.Name)
        .ToDictionary(x => x.Key,
                      x => x.Select(y => y.Count)
                               .GroupBy(y => y)
                               .ToDictionary(y => y.Key, y => y.Count()));
    var nonZeroCandidates = candidates
        .Where(x => x.Count != 0)
        .GroupBy(x => x.Name)
        .ToDictionary(x => x.Key,
                      x => x.Select(y => y.Count)
                               .GroupBy(y => y)
                               .ToDictionary(y => y.Key, y => y.Count()));

    foreach (var name in nonZeroRequirements.Keys.ToList())
    {
        var requirementsForName = nonZeroRequirements[name];
        Dictionary<int, int> candidatesForName;
        if (!nonZeroCandidates.TryGetValue(name, out candidatesForName))
        {
            return false;
        }
        if (candidatesForName.Sum(x => x.Value) <
            requirementsForName.Sum(x => x.Value))
        {
            return false;
        }
        if (candidatesForName.Sum(x => x.Value*x.Key) <
            requirementsForName.Sum(x => x.Value*x.Key))
        {
            return false;
        }

        EliminateExactMatches(candidatesForName, requirementsForName);
        EliminateHighRequirementsWithAvailableHigherCandidate(candidatesForName, requirementsForName);
        EliminateRequirementsThatHaveAMatchingCandidateSum(candidatesForName, requirementsForName);

        if (requirementsForName
            .Any(x => x.Value > 0))
        {
            return false;
        }
    }

    return true;
}

private static void EliminateRequirementsThatHaveAMatchingCandidateSum(
    IDictionary<int, int> candidatesForName,
    IDictionary<int, int> requirementsForName)
{
    var requirements = requirementsForName
        .Where(x => x.Value > 0)
        .OrderByDescending(x => x.Key)
        .SelectMany(x => Enumerable.Repeat(x.Key, x.Value))
        .ToList();
    if (!requirements.Any())
    {
        return;
    }

    // requirements -> candidates used
    var items = GenerateCandidateSetsThatSumToOrOverflow(
        requirements.First(),
        candidatesForName,
        new List<int>())
        .Concat(new[] {new KeyValuePair<int, IList<int>>(0, new List<int>())})
        .Select(x => new KeyValuePair<IList<int>, IList<int>>(
                         new[] {x.Key}, x.Value));

    foreach (var count in requirements.Skip(1))
    {
        var count1 = count;
        items = (from i in items
                 from o in GenerateCandidateSetsThatSumToOrOverflow(
                     count1,
                     candidatesForName,
                     i.Value)
                 select
                     new KeyValuePair<IList<int>, IList<int>>(
                     i.Key.Concat(new[] {o.Key}).OrderBy(x => x).ToList(),
                     i.Value.Concat(o.Value).OrderBy(x => x).ToList()))
            .GroupBy(
                x => String.Join(",", x.Key.Select(y => y.ToString()).ToArray()) + ">"
                     + String.Join(",", x.Value.Select(y => y.ToString()).ToArray()))
            .Select(x => x.First());
    }

    var bestSet = items
        .OrderByDescending(x => x.Key.Count(y => y > 0)) // match the most requirements
        .ThenByDescending(x => x.Value.Count) // use the most candidates
        .ToList();
    var best = bestSet.First();

    foreach (var requirementCount in best.Key.Where(x => x > 0))
    {
        requirementsForName[requirementCount] -= 1;
    }

    foreach (var candidateCount in best.Value.Where(x => x > 0))
    {
        candidatesForName[candidateCount] -= 1;
    }
}

private static void EliminateHighRequirementsWithAvailableHigherCandidate(
    IDictionary<int, int> candidatesForName,
    IDictionary<int, int> requirementsForName)
{
    foreach (var count in requirementsForName
        .Where(x => x.Value > 0)
        .OrderByDescending(x => x.Key)
        .Select(x => x.Key)
        .ToList())
    {
        while (requirementsForName[count] > 0)
        {
            var count1 = count;
            var largerCandidates = candidatesForName
                .Where(x => x.Key > count1)
                .OrderByDescending(x => x.Key)
                .ToList();
            if (!largerCandidates.Any())
            {
                return;
            }

            var largerCount = largerCandidates.First().Key;
            requirementsForName[count] -= 1;
            candidatesForName[largerCount] -= 1;
        }
    }
}

private static void EliminateExactMatches(
    IDictionary<int, int> candidatesForName,
    IDictionary<int, int> requirementsForName)
{
    foreach (var count in requirementsForName.Keys.ToList())
    {
        int numberOfCount;
        if (candidatesForName.TryGetValue(count, out numberOfCount) &&
            numberOfCount > 0)
        {
            var toRemove = Math.Min(numberOfCount, requirementsForName[count]);
            requirementsForName[count] -= toRemove;
            candidatesForName[count] -= toRemove;
        }
    }
}

private static IEnumerable<KeyValuePair<int, IList<int>>> GenerateCandidateSetsThatSumToOrOverflow(
    int sumNeeded,
    IEnumerable<KeyValuePair<int, int>> candidates,
    IEnumerable<int> usedCandidates)
{
    var usedCandidateLookup = usedCandidates
        .GroupBy(x => x)
        .ToDictionary(x => x.Key, x => x.Count());
    var countToIndex = candidates
        .Select(x => Enumerable.Range(
            0,
            usedCandidateLookup.ContainsKey(x.Key)
                ? x.Value - usedCandidateLookup[x.Key]
                : x.Value)
                         .Select(i => new KeyValuePair<int, int>(x.Key, i)))
        .SelectMany(x => x)
        .ToList();

    // sum to List of <count,index>
    var sumToElements = countToIndex
        .Select(a => new KeyValuePair<int, IList<KeyValuePair<int, int>>>(
                         a.Key, new[] {a}))
        .ToList();

    countToIndex = countToIndex.Where(x => x.Key < sumNeeded).ToList();

    while (sumToElements.Any())
    {
        foreach (var set in sumToElements
            .Where(x => x.Key >= sumNeeded))
        {
            yield return new KeyValuePair<int, IList<int>>(
                sumNeeded,
                set.Value.Select(x => x.Key).ToList());
        }

        sumToElements = (from a in sumToElements.Where(x => x.Key < sumNeeded)
                         from b in countToIndex
                         where !a.Value.Any(x => x.Key == b.Key && x.Value == b.Value)
                         select new KeyValuePair<int, IList<KeyValuePair<int, int>>>(
                             a.Key + b.Key,
                             a.Value.Concat(new[] {b})
                                 .OrderBy(x => x.Key)
                                 .ThenBy(x => x.Value)
                                 .ToList()))
            .GroupBy(x => String.Join(",", x.Value.Select(y => y.Key.ToString()).ToArray()))
            .Select(x => x.First())
            .ToList();
    }
}


private static IEnumerable<int> GetAddendsFor(int sum, Random random)
{
    var values = new List<int>();
    while (sum > 0)
    {
        var addend = random.Next(1, sum);
        sum -= addend;
        values.Add(addend);
    }
    return values;
}

Tests:

[Test]
public void ABCC_0063__with_candidates__BCADCC_010244__should_return_false()
{
    var requirementNames = "ABCC".Select(x => x.ToString()).ToArray();
    var requirementCounts = new[] {0, 0, 6, 3};

    var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
    var candidateCounts = new[] {0, 1, 0, 2, 4, 4};

    var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
    actual.ShouldBeFalse();
}

[Test]
public void ABC_003__with_candidates__BCADCC_010244__should_return_true()
{
    var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
    var requirementCounts = new[] {0, 0, 3};

    var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
    var candidateCounts = new[] {0, 1, 0, 2, 4, 4};

    var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
    actual.ShouldBeTrue();
}

[Test]
public void ABC_003__with_candidates__BCAD_0102__should_return_false()
{
    var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
    var requirementCounts = new[] {0, 0, 3};

    var candidateNames = "BCAD".Select(x => x.ToString()).ToArray();
    var candidateCounts = new[] {0, 1, 0, 2};

    var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
    actual.ShouldBeFalse();
}

[Test]
public void ABC_009__with_candidates__BCADCC_010244__should_return_true()
{
    var requirementNames = "ABC".Select(x => x.ToString()).ToArray();
    var requirementCounts = new[] {0, 0, 9};

    var candidateNames = "BCADCC".Select(x => x.ToString()).ToArray();
    var candidateCounts = new[] {0, 1, 0, 2, 4, 4};

    var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
    actual.ShouldBeTrue();
}

[Test]
public void FuzzTestIt()
{
    var random = new Random();
    const string names = "ABCDE";

    for (var tries = 0; tries < 10000000; tries++)
    {
        var numberOfRequirements = random.Next(5);
        var shouldPass = true;
        var requirementNames = new List<string>();
        var requirementCounts = new List<int>();
        var candidateNames = new List<string>();
        var candidateCounts = new List<int>();
        for (var i = 0; i < numberOfRequirements; i++)
        {
            var name = names.Substring(random.Next(names.Length), 1);
            switch (random.Next(6))
            {
                case 0: // zero-requirement with corresponding candidate
                    requirementNames.Add(name);
                    requirementCounts.Add(0);
                    candidateNames.Add(name);
                    candidateCounts.Add(0);
                    break;
                case 1: // zero-requirement without corresponding candidate
                    requirementNames.Add(name);
                    requirementCounts.Add(0);
                    shouldPass = false;
                    break;
                case 2: // non-zero-requirement with corresponding candidate
                    {
                        var count = random.Next(1, 10);
                        requirementNames.Add(name);
                        requirementCounts.Add(count);
                        candidateNames.Add(name);
                        candidateCounts.Add(count);
                    }
                    break;
                case 3: // non-zero-requirement with matching sum of candidates
                    {
                        var count = random.Next(1, 10);
                        requirementNames.Add(name);
                        requirementCounts.Add(count);
                        foreach (var value in GetAddendsFor(count, random))
                        {
                            candidateNames.Add(name);
                            candidateCounts.Add(value);
                        }
                    }
                    break;
                case 4: // non-zero-requirement with matching overflow candidate
                    {
                        var count = random.Next(1, 10);
                        requirementNames.Add(name);
                        requirementCounts.Add(count);
                        candidateNames.Add(name);
                        candidateCounts.Add(count + 2);
                    }
                    break;
                case 5: // non-zero-requirement without matching candidate or sum of candidates
                    {
                        var count = random.Next(10, 20);
                        requirementNames.Add(name);
                        requirementCounts.Add(count);
                        shouldPass = false;
                    }
                    break;
            }
        }

        try
        {
            var actual = IsValid(requirementNames, requirementCounts, candidateNames, candidateCounts);
            actual.ShouldBeEqualTo(shouldPass);
        }
        catch (Exception e)
        {
            Console.WriteLine("Requirements: " + String.Join(", ", requirementNames.ToArray()));
            Console.WriteLine("              " +
                              String.Join(", ", requirementCounts.Select(x => x.ToString()).ToArray()));
            Console.WriteLine("Candidates:   " + String.Join(", ", candidateNames.ToArray()));
            Console.WriteLine("              " +
                              String.Join(", ", candidateCounts.Select(x => x.ToString()).ToArray()));
            Console.WriteLine(e);
            Assert.Fail();
        }
    }
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow