Question

What's the accepted practice on forcing evaluation of LINQ queries with methods like ToArray(), and are there general heuristics for composing optimal chains of queries? I often try to do everything in a single pass because I've noticed that in those cases AsParallel() does a really good job of speeding up the computation. When the queries perform computations with no side effects but several passes are required to get the right data out, is forcing the computation with ToArray() the right way to go, or is it better to leave the query in lazy form?

Solution

Keep queries in lazy form until you would otherwise evaluate the same query multiple times. Evaluate even earlier if you need the results in another form, or if variables captured in closures are in danger of changing their values before the query runs.

You may want to evaluate when the query contains complex projections which you want to avoid performing multiple times (e.g. constructing complex objects for sequences with lots of elements). In this case evaluating once and iterating many times is much saner.
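As a minimal sketch of this point, the example below counts how often a projection runs when a lazy query is enumerated twice, compared with a query that is materialized once with ToList(). The names MaterializeOnceDemo, ExpensiveProjection and ProjectionCalls are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeOnceDemo
{
    // Counts how many times the (hypothetical) expensive projection runs.
    public static int ProjectionCalls;

    // Stand-in for constructing a complex object.
    static string ExpensiveProjection(int n)
    {
        ProjectionCalls++;
        return "item-" + n;
    }

    // Enumerates the whole sequence once.
    static void Consume(IEnumerable<string> items)
    {
        foreach (string item in items) { }
    }

    public static (int LazyCalls, int MaterializedCalls) Run()
    {
        int[] source = { 1, 2, 3, 4, 5 };

        // Lazy: the projection runs again on every enumeration.
        ProjectionCalls = 0;
        IEnumerable<string> lazy = source.Select(n => ExpensiveProjection(n));
        Consume(lazy);
        Consume(lazy);
        int lazyCalls = ProjectionCalls;

        // Materialized: the projection runs once, then the list is reused.
        ProjectionCalls = 0;
        List<string> materialized = source.Select(n => ExpensiveProjection(n)).ToList();
        Consume(materialized);
        Consume(materialized);
        int materializedCalls = ProjectionCalls;

        return (lazyCalls, materializedCalls);
    }

    public static void Main()
    {
        var (lazyCalls, materializedCalls) = Run();
        Console.WriteLine(lazyCalls);         // 10 (5 elements, 2 passes)
        Console.WriteLine(materializedCalls); // 5  (5 elements, 1 pass)
    }
}
```

With a cheap projection the difference is negligible, so this only pays off when the per-element work is costly or the sequence is large.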

You may need the results in another form if you want to return them or pass them to another API that expects a specific type of collection.

You may want or need to prevent access to modified closures if the query captures variables that are not local in scope. Until the query is actually evaluated, other code can change their values "behind your back"; when the evaluation happens, it will use the current values rather than those present when the query was constructed. (You can work around this by copying those values into another variable that does have local scope.)
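To illustrate the closure problem, here is a small sketch with a hypothetical static field, Threshold, that changes after the queries are built. It compares a lazy query, a query forced with ToArray(), and a lazy query that uses a local copy of the value. All names are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Hypothetical shared state that other code may change at any time.
    public static int Threshold = 4;

    public static (int[] Lazy, int[] Forced, int[] LocalCopy) Run()
    {
        Threshold = 4;
        int[] numbers = { 1, 5, 10, 15 };

        // Deferred: reads Threshold only when the query is enumerated.
        IEnumerable<int> lazy = numbers.Where(n => n > Threshold);

        // Forced: evaluated right now, while Threshold is still 4.
        int[] forced = numbers.Where(n => n > Threshold).ToArray();

        // Workaround: copy the value into a local that nothing else modifies.
        int localThreshold = Threshold;
        IEnumerable<int> localCopy = numbers.Where(n => n > localThreshold);

        // Later, some other code changes the shared value.
        Threshold = 12;

        return (lazy.ToArray(), forced, localCopy.ToArray());
    }

    public static void Main()
    {
        var (lazy, forced, localCopy) = Run();
        Console.WriteLine(string.Join(",", lazy));      // 15
        Console.WriteLine(string.Join(",", forced));    // 5,10,15
        Console.WriteLine(string.Join(",", localCopy)); // 5,10,15
    }
}
```

Only the first query sees the changed value, because it is the only one that reads the shared field at enumeration time.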

OTHER TIPS

If you are not averse to using an 'experimental' library, you could use the EnumerableEx.Memoize extension method from the Interactive Extensions library.

This method provides a best-of-both-worlds option where the underlying sequence is computed on demand, but is not recomputed on subsequent passes. Another small benefit, in my opinion, is that the return type is not a mutable collection, as it would be with ToArray or ToList.
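The behaviour described here can be approximated with a hand-written helper. The sketch below is not the Interactive Extensions implementation (which, as far as I know, returns an IBuffer<T>); MemoizeLazily is a hypothetical name, and the code assumes single-threaded use. It only shows the idea: elements are pulled from the source on demand and cached, so later passes replay the cache.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MemoizeSketch
{
    // Hypothetical helper, NOT the Ix implementation; not thread-safe.
    public static IEnumerable<T> MemoizeLazily<T>(this IEnumerable<T> source)
    {
        var cache = new List<T>();          // elements computed so far
        IEnumerator<T> enumerator = null;   // shared cursor over the source
        bool exhausted = false;

        IEnumerable<T> Replay()
        {
            for (int i = 0; ; i++)
            {
                if (i == cache.Count)
                {
                    // Cache miss: pull one more element from the source.
                    if (exhausted) yield break;
                    if (enumerator == null) enumerator = source.GetEnumerator();
                    if (!enumerator.MoveNext())
                    {
                        exhausted = true;
                        enumerator.Dispose();
                        yield break;
                    }
                    cache.Add(enumerator.Current);
                }
                yield return cache[i];
            }
        }

        return Replay();
    }

    public static (int AfterBuild, int AfterPartial, int AfterTwoPasses) Run()
    {
        int calls = 0;
        IEnumerable<int> squares = new[] { 1, 2, 3 }
            .Select(n => { calls++; return n * n; })
            .MemoizeLazily();
        int afterBuild = calls;             // nothing computed yet

        foreach (int s in squares) break;   // pull only the first element
        int afterPartial = calls;

        List<int> first = squares.ToList(); // computes the remaining elements
        List<int> second = squares.ToList(); // replays the cache
        int afterTwoPasses = calls;

        return (afterBuild, afterPartial, afterTwoPasses);
    }

    public static void Main()
    {
        var (afterBuild, afterPartial, afterTwoPasses) = Run();
        Console.WriteLine(afterBuild);      // 0
        Console.WriteLine(afterPartial);    // 1
        Console.WriteLine(afterTwoPasses);  // 3
    }
}
```

For real code, prefer the library method over a helper like this one; the sketch ignores thread safety and never releases the cache.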

You would normally only use ToArray() when you need an array, for example with an API that expects one. As long as you don't need to access the results of a query, and you're not confined to some kind of connection context (as may be the case in LINQ to SQL or LINQ to Entities), you might as well keep the query in lazy form.

Licensed under: CC-BY-SA with attribution