Question

Recently, I ran into a strange problem where a method of mine generated an IEnumerable collection of objects. The method contained four yield return statements that returned four objects. I assigned the result to a variable named result using the var keyword.

var result = GenerateCollection().ToList();

This effectively meant: List<MyType> result = GenerateCollection().ToList();

I wrote a simple for loop over the elements of this collection. What surprised me was that the collection was re-enumerated on every access to the list (for each result[i]). Later I used result in a LINQ query, which performed badly because of the continual re-enumeration of the collection.

I solved the problem by converting the result to an array instead of a list.

What this makes me wonder now is: when are collections enumerated? Which method calls cause a collection to be re-enumerated?

EDIT: The GenerateCollection() method looked similar to this:

public static IEnumerable<MyType> GenerateCollection()
{
    var array = data.AsParallel(); //data is a simple collection of sublists of strings
    yield return new MyType("a", array.Where(x => x.Sublist.Count(y => y == 'a') == 0));
    yield return new MyType("b", array.Where(x => x.Sublist.Count(y => y == 'b') == 0));
    yield return new MyType("c", array.Where(x => x.Sublist.Count(y => y == 'c') == 0));
    yield return new MyType("d", array.Where(x => x.Sublist.Count(y => y == 'd') == 0));
}

Solution

You are yielding objects that contain queries. What you pass to the MyType constructor is not a sequence of array values but iterator objects, and they are not executed at that point. When you create the list of MyType objects

var result = GenerateCollection().ToList();

all MyType instances are yielded and saved into the list, but unless the MyType constructor enumerates the iterators, the queries are not executed. What is more, they are executed again every time you call an operator that runs the query, e.g.

result[i].ArrayIterator.Count(); // first execution
foreach(var item in result[i].ArrayIterator) // second execution
    // ...
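
The repeated execution can be observed with a counter. The following is only a minimal sketch, not the asker's code: Holder, Items, Data and PredicateCalls are hypothetical names standing in for MyType, its query property, the data source and some instrumentation, and plain LINQ to Objects is assumed instead of AsParallel().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyType: it stores the query object it is given.
public sealed class Holder
{
    public string Name { get; }
    public IEnumerable<int> Items { get; }

    public Holder(string name, IEnumerable<int> items)
    {
        Name = name;
        Items = items;
    }
}

public static class DeferredDemo
{
    // Counts how often the Where predicate runs.
    public static int PredicateCalls;

    static readonly int[] Data = { 1, 2, 3, 4 };

    public static IEnumerable<Holder> Generate()
    {
        // Where only builds a query; nothing is filtered at this point.
        yield return new Holder("even",
            Data.Where(x => { PredicateCalls++; return x % 2 == 0; }));
    }

    public static void Main()
    {
        List<Holder> result = Generate().ToList(); // runs the generator once
        Console.WriteLine(PredicateCalls);         // 0: the inner query has not run

        result[0].Items.Count();                   // first execution
        Console.WriteLine(PredicateCalls);         // 4

        foreach (var item in result[0].Items) { }  // second execution
        Console.WriteLine(PredicateCalls);         // 8
    }
}
```

Under these assumptions, ToList() on the outer sequence runs the generator exactly once, while the inner query starts over on every Count() or foreach.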

You can fix this by passing the result of the query execution to the MyType constructor:

yield return new MyType("a", array.Where(x => !x.Sublist.Contains('a')).ToList());

Now you are passing a list of items instead of an iterator (you can also use ToArray()). The query is executed when the MyType instance is yielded, and it will not be executed again.
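
As a rough illustration of the effect of ToList() here, the sketch below counts predicate calls when the query is materialized at the yield. MaterializedDemo, Data and PredicateCalls are hypothetical names, and a plain int array with ordinary LINQ to Objects is assumed rather than the original data and AsParallel().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializedDemo
{
    // Counts how often the Where predicate runs.
    public static int PredicateCalls;

    static readonly int[] Data = { 1, 2, 3, 4 };

    public static IEnumerable<List<int>> Generate()
    {
        // ToList runs the query here, once, and yields the finished list.
        yield return Data.Where(x => { PredicateCalls++; return x % 2 == 0; }).ToList();
    }

    public static void Main()
    {
        List<List<int>> result = Generate().ToList();
        Console.WriteLine(PredicateCalls);   // 4: executed while the item was yielded

        Console.WriteLine(result[0].Count);  // 2: reads the stored list
        foreach (var item in result[0]) { }  // iterates the stored list
        Console.WriteLine(PredicateCalls);   // still 4
    }
}
```

The trade-off is that the filtered items are held in memory for as long as the list is referenced, which is usually acceptable when the results are read more than once.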

OTHER TIPS

array.Where(x => x.Sublist.Count(y => y == 'a') == 0)

This piece of code will be enumerated every time you access it in MyType. Use ToList or ToArray to ensure it is enumerated only once, at the place where the query is written.

Sequences based on deferred execution, for example IEnumerable and IQueryable queries, are enumerated each time you use them, whereas collections based on immediate execution, for example List, are enumerated once, when they are created.
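
One way to see the difference between deferred and immediate execution is to change the source after the query has been written. This is a small sketch with hypothetical names (SnapshotDemo, Run, source), assuming ordinary LINQ to Objects:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    public static (int Deferred, int Immediate) Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Deferred: only the query is stored, nothing is filtered yet.
        IEnumerable<int> deferred = source.Where(x => x > 1);

        // Immediate: ToList executes the query now and keeps the results.
        List<int> immediate = source.Where(x => x > 1).ToList();

        source.Add(4);

        // The deferred query reads source again and sees the new element;
        // the list is a snapshot taken before Add.
        return (deferred.Count(), immediate.Count);
    }

    public static void Main()
    {
        var (deferredCount, immediateCount) = Run();
        Console.WriteLine(deferredCount);   // 3
        Console.WriteLine(immediateCount);  // 2
    }
}
```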

Licensed under: CC-BY-SA with attribution