Question

Based on the following code:

    var grouped = filters.GroupBy(p => p.PropertyName);
    int numOfRowElements = grouped.Count();
    foreach (IGrouping<string, PropertyFilter> filter in grouped)
    {
        foreach (var propertyFilter in filter)
        {
            // do something
        }
    }

where filters is a List, my understanding is that calling Enumerable.Count() forces the query to be executed. Is the result of this execution stored in the grouped variable, which is then used in the foreach loop, or does the foreach loop force the query to be executed again? Would it be better to do this instead?

    var grouped = filters.GroupBy(p => p.PropertyName).ToList();
    int numOfRowElements = grouped.Count;
    foreach (IGrouping<string, PropertyFilter> filter in grouped)
    {
        foreach (var propertyFilter in filter)
        {
            // do something
        }
    }

TIA.

Solution

If the underlying data source implements ICollection<T> (as List<T> does), Enumerable.Count() will read its .Count property as an optimization, so there is no* performance penalty. If it does not, an enumeration will be forced. Consider the following:

var someList = new List<int>();
var listCount = someList.Count(); // will use the .Count property
var orderedCount = someList.OrderBy(x => x).Count(); // will force enumeration

In this example, I'm just getting the count of the list in the second statement. In the third, I'm ordering the list and then getting the count. Ordering the list returns a sequence, not a list. Therefore, the Count() method is not working on an ICollection<T>, but on a plain IEnumerable<T>. In this case, the query must be enumerated to acquire the result and will incur whatever cost comes along with it (in this case, the ordering).
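
To make the distinction above concrete, here is a minimal sketch (not part of the original answer). It checks which of the two sequences exposes ICollection<int>, the interface Enumerable.Count() looks for, and uses a hypothetical iterator, Numbers(), with a counter to show that Count() has to pull every item when no Count property is available. The variable names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var someList = new List<int> { 3, 1, 2 };

// List<int> implements ICollection<int>, so Count() can read its Count property.
IEnumerable<int> listAsSequence = someList;
bool listIsCollection = listAsSequence is ICollection<int>;

// The result of OrderBy is a deferred sequence, not a collection.
IEnumerable<int> ordered = someList.OrderBy(x => x);
bool orderedIsCollection = ordered is ICollection<int>;

// Hypothetical iterator that records how many items it hands out.
int itemsYielded = 0;
IEnumerable<int> Numbers()
{
    foreach (int n in someList)
    {
        itemsYielded++;
        yield return n;
    }
}

// No Count property to read here, so Count() walks the whole sequence.
int count = Numbers().Count();

Console.WriteLine($"{listIsCollection} {orderedIsCollection} {count} {itemsYielded}");
// prints "True False 3 3"
```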

In light of this, in your first snippet, you will enumerate your query twice: once to get the count, and once in the foreach. This will perform all the logic to group your data twice. Your second example will perform the grouping operations just once, while obviously still iterating over the resulting list in the foreach, which should be less expensive than also performing the grouping operation a second time. (Whether you can measure any savings will depend entirely on the size and/or source of the data in the original list. When in doubt, profile it.)
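
If you want to see the double execution for yourself, a rough sketch like the following counts how often the key selector runs. It uses plain strings as a stand-in for the PropertyFilter objects in the question, and the counter variables are invented for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data: property names only, instead of PropertyFilter objects.
var filters = new List<string> { "Color", "Size", "Color", "Weight" };

int keySelectorCalls = 0;

// Deferred query: nothing has run yet.
var grouped = filters.GroupBy(name => { keySelectorCalls++; return name; });

int numOfRowElements = grouped.Count();   // first execution of the grouping
int callsAfterCount = keySelectorCalls;

foreach (var g in grouped) { }            // second execution of the grouping
int callsAfterForeach = keySelectorCalls;

// Materialized variant: ToList() runs the grouping once and stores the groups.
keySelectorCalls = 0;
var groupedList = filters.GroupBy(name => { keySelectorCalls++; return name; }).ToList();
int listCount = groupedList.Count;        // property read, no re-execution
foreach (var g in groupedList) { }        // iterates the stored groups
int callsWithToList = keySelectorCalls;

Console.WriteLine($"{callsAfterCount} {callsAfterForeach} {callsWithToList}");
// prints "4 8 4"
```

With four items, the key selector runs four times per execution, so the deferred version reaches eight calls while the ToList() version stays at four.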


*There may be a small measurable penalty for the layer of indirection; you'll have to profile this if you believe it is a true bottleneck. But think of the Count() method as

if (sequence is ICollection<T>)
{
    return ((ICollection<T>)sequence).Count;
}
else
{
    /* enumerate the sequence, counting each item */
}
Licensed under: CC-BY-SA with attribution