Question

I have a large IEnumerable of EntityObjects and a large IEnumerable of strings, which are keys of those objects.

I want to obtain a new list of only the objects whose key is matched. At the moment I am doing this with Contains(), but it seems pretty slow.

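class Foo
{
    public string Key { get; set; }
    public string Prop1 { get; set; }
    public int Prop2 { get; set; }
    public decimal Prop3 { get; set; }
    public Bar Prop4 { get; set; }
    public Thing Prop5 { get; set; }
    public Stuff Prop6 { get; set; }
    // ...more properties
}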

IEnumerable<Foo> foos;
IEnumerable<string> fooKeys;

var matchedFoos = foos.Where(f => fooKeys.Contains(f.Key));

This works and returns what I expect, but it seems slow, and I think there must be a better way. I've seen a few posts on Intersect, but that seems to be for enumerables of the same type.

For info:

  • foos.Count() approx 164,000
  • fooKeys.Count() approx 75,000

Solution

  1. You should probably do the search in the database (using LINQ to Entities), not in the application (using LINQ to Objects).

  2. You can change fooKeys to a HashSet<string> (if it's not one already) to make each Contains() call O(1) on average instead of O(n), so the filter does roughly one hash lookup per Foo rather than a scan of up to 75,000 keys per Foo:

    var keySet = new HashSet<string>(fooKeys);
    var matchedFoos = foos.Where(f => keySet.Contains(f.Key));
    

    But with collections that big it will still require a fair amount of time to perform the search.
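As a rough, self-contained sketch of the HashSet approach in point 2: the cut-down Foo, the MatchByKey helper, and the generated test data are all assumptions made for illustration, not part of the original question. ToList() is added so the filter runs once rather than on every enumeration of the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the question's Foo; only Key matters for the match.
class Foo
{
    public string Key { get; set; } = "";
    public int Prop2 { get; set; }
}

static class Program
{
    // Hypothetical helper: returns the foos whose Key appears in fooKeys.
    public static List<Foo> MatchByKey(IEnumerable<Foo> foos, IEnumerable<string> fooKeys)
    {
        // Build the set once; each Contains() is then O(1) on average.
        var keySet = new HashSet<string>(fooKeys);

        // ToList() forces evaluation so the filter is not re-run per enumeration.
        return foos.Where(f => keySet.Contains(f.Key)).ToList();
    }

    static void Main()
    {
        // Made-up data sized like the question: 164,000 foos, 75,000 keys.
        var foos = Enumerable.Range(0, 164000)
            .Select(i => new Foo { Key = "k" + i, Prop2 = i });
        var fooKeys = Enumerable.Range(0, 75000).Select(i => "k" + (i * 2));

        var matched = MatchByKey(foos, fooKeys);
        Console.WriteLine(matched.Count); // 75000: every generated key exists in foos
    }
}
```

If foos is actually an un-materialized LINQ to Entities query, enumerating it this way pulls every row into memory first, which is the cost point 1 warns about.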

OTHER TIPS

Another variant uses a join clause, like this:

IEnumerable<Foo> foos;
IEnumerable<string> fooKeys;

var matchedFoos = from foo in foos
                  join fk in fooKeys on foo.Key equals fk
                  select foo;
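
One difference from the Contains() version is worth noting. In LINQ to Objects, Join builds a hash-based lookup over the inner sequence (fooKeys here), so it performs similarly to the HashSet approach, but it emits one result per matching pair. If fooKeys contains a repeated key, the same Foo comes back more than once. The sketch below shows this with made-up data; the cut-down Foo and the MatchByJoin helper are assumptions for illustration, and Distinct() is one possible way to avoid the duplicates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cut-down stand-in for the question's Foo.
class Foo
{
    public string Key { get; set; } = "";
}

static class JoinDemo
{
    // Hypothetical helper: join-based match that tolerates repeated keys.
    public static List<Foo> MatchByJoin(IEnumerable<Foo> foos, IEnumerable<string> fooKeys)
    {
        // Distinct() keeps a repeated key from emitting the same Foo twice.
        var query = from foo in foos
                    join fk in fooKeys.Distinct() on foo.Key equals fk
                    select foo;
        return query.ToList();
    }

    static void Main()
    {
        var foos = new[]
        {
            new Foo { Key = "a" }, new Foo { Key = "b" }, new Foo { Key = "c" }
        };
        var fooKeys = new[] { "a", "a", "c" }; // "a" appears twice

        // Plain join: "a" matches twice, so three results come back.
        var plain = from foo in foos
                    join fk in fooKeys on foo.Key equals fk
                    select foo;
        Console.WriteLine(plain.Count());                      // 3

        Console.WriteLine(MatchByJoin(foos, fooKeys).Count);   // 2
    }
}
```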
Licensed under: CC-BY-SA with attribution