Compare the members of an ICollection with each other

https://stackoverflow.com/questions/8888002

16-04-2021

Question

Is there a cheaper way to compare an ICollection with itself?

Here is my code:

        public IEnumerable<Pet> speciesChecker()
        {
            foreach (Pet pet in _pets)
            {
                bool wantedSpecies = true;
                foreach (Pet pet2 in _pets)
                {
                    if (pet2 != pet && pet.Species == pet2.Species)
                    {
                        wantedSpecies = false;
                        break;
                    }
                }
                if (wantedSpecies) yield return pet;
            }
        }

What is the time complexity of my code? All I know is that it is less than O(N^2), and that if I remove the 'break' from the inner foreach loop it becomes O(N^2). Please correct me if I am wrong.

Solution

Let n be the length of the _pets collection.

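Number of required steps with break, assuming the inner loop for the i-th pet stops after about i iterations:

1+2+3+...+n = n*(n+1)/2 = n^2/2 + n/2 = O(n^2)

If every species is distinct, the break never fires and the inner loop runs n times for each pet, which is also O(n^2). So the break does not change the worst-case complexity.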

Two simple rules for calculating O, from Wikipedia:

If f(x) is a sum of several terms, the one with the largest growth rate is kept, and all others omitted.
If f(x) is a product of several factors, any constants (terms in the product that do not depend on x) are omitted.

Number of required steps without break:

n+n+n+...+n = n^2 = O(n^2)
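
To check the two sums empirically, here is a small sketch that counts how often the inner loop body runs. It works on plain species strings instead of Pet objects and compares by index rather than by reference; StepCounter and CountInnerSteps are illustrative names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StepCounter
{
    // Counts how many times the inner loop body runs for a list of species names.
    // useBreak mirrors the 'break' in the question's inner loop.
    public static long CountInnerSteps(IReadOnlyList<string> species, bool useBreak)
    {
        long steps = 0;
        for (int i = 0; i < species.Count; i++)
        {
            for (int j = 0; j < species.Count; j++)
            {
                steps++;
                // same condition as the question: a different pet with the same species
                if (i != j && species[i] == species[j] && useBreak)
                    break;
            }
        }
        return steps;
    }
}
```

For 100 pets of 100 distinct species, both variants take 10,000 steps because the break never fires. For 100 pets that all share one species, the break variant takes 101 steps. So the break helps on some inputs, but the worst case stays O(n^2).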

OTHER TIPS

Here is my take on it:

// yields the species keys that occur exactly once (not the pets themselves)
var q = list.GroupBy (l => l.Species)
          .Where (l => l.ElementAtOrDefault(1) == null)
          .Select (l => l.Key);

GroupBy builds a hash-based lookup internally, so grouping is O(N).
ElementAtOrDefault(1) only has to look at the first two elements of each group, so it is not O(n) per group.
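
Note that the query above yields species keys. A sketch of a variant that returns the pets themselves follows. The Pet class here is a hypothetical stand-in, since the question doesn't show the real type, and UniqueSpecies.Find is an illustrative name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal Pet type; the question does not show the real one.
public sealed class Pet
{
    public Pet(string name, string species) { Name = name; Species = species; }
    public string Name { get; }
    public string Species { get; }
}

public static class UniqueSpecies
{
    // Returns the pets whose species occurs exactly once in the input.
    public static IEnumerable<Pet> Find(IEnumerable<Pet> pets)
    {
        return pets.GroupBy(p => p.Species)
                   .Where(g => !g.Skip(1).Any())   // group has no second element
                   .Select(g => g.First());        // groups are never empty
    }
}
```

Given Rex (dog), Tom (cat), Fido (dog) and Polly (parrot), this should yield Tom and Polly, because GroupBy keeps groups in the order their keys first appear.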

I think the code below does the same thing as yours, and it is an O(N) algorithm. The trick is to store the pets in a dictionary indexed by species.

    public IEnumerable<Pet> speciesChecker()
    {
        // group the pets by species in a single pass
        var species = new Dictionary<Species, List<Pet>>();
        foreach (Pet pet in _pets)
        {
            // create the list if it doesn't exist
            if (!species.TryGetValue(pet.Species, out var list))
            {
                list = new List<Pet>();
                species[pet.Species] = list;
            }
            list.Add(pet);
        }

        // foreach species, if there is only one pet of that species, then return it
        foreach (var speciesPets in species.Values)
        {
            if (speciesPets.Count == 1)
                yield return speciesPets[0];
        }
    }
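
If the order of the original collection matters, a counting variant of the same dictionary idea may be simpler: count the pets per species in one pass, then yield the pets whose species count is 1 in a second pass. This is a sketch, not the code from the answer above. It is generic so that it assumes nothing about Pet, and UniqueBy is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class SpeciesCounter
{
    // Yields the items whose key occurs exactly once, in input order.
    // Two O(n) passes with O(1) expected dictionary operations.
    public static IEnumerable<T> UniqueBy<T, TKey>(
        IReadOnlyCollection<T> items, Func<T, TKey> keyOf)
        where TKey : notnull
    {
        // first pass: count occurrences per key
        var counts = new Dictionary<TKey, int>();
        foreach (var item in items)
        {
            var key = keyOf(item);
            counts.TryGetValue(key, out int seen);   // seen is 0 when the key is new
            counts[key] = seen + 1;
        }

        // second pass: keep items whose key was seen exactly once
        foreach (var item in items)
        {
            if (counts[keyOf(item)] == 1)
                yield return item;
        }
    }
}
```

With pets this would be called as something like SpeciesCounter.UniqueBy(_pets, p => p.Species), assuming _pets is a materialized collection. Unlike iterating Dictionary.Values, whose enumeration order is not guaranteed, the result follows the order of the input.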

You can also use something like the following, which should also be O(N):

public IEnumerable<Pet> speciesChecker ()
{
    // copy each group into a list, keep the single-element lists, then flatten
    return _pets.GroupBy (p => p.Species)
                .Select (g => new List<Pet> (g))
                .Where (l => l.Count == 1)
                .SelectMany (l => l);
}

The extra Select (g => new List<Pet> (g)) may be superfluous, but I believe that will help avoid iterating the whole grouping logic a second time, which I believe would result in O(N^2) .

Edit: Good comment from Magnus about the List constructor operating in O(n) defeating the purpose...

How about:

public IEnumerable<Pet> speciesChecker ()
{
    var groups = _pets.GroupBy (p => p.Species);

    foreach (var grp in groups)
    using (var e = grp.GetEnumerator ()) {
        // skip empty groups (GroupBy should never produce one)
        if (!e.MoveNext ())
            continue;

        var first = e.Current;

        // a second element means the species is not unique
        if (e.MoveNext ())
            continue;

        yield return first;
    }
}

I think this is as optimized as you can get, and it runs in O(n). It also avoids the Any () and Count () extension methods on IEnumerable<T>.

Thoughts?

Licensed under: CC-BY-SA with attribution
