Force Linq to not delay execution

https://stackoverflow.com/questions/1064043

21-08-2019

Question

In fact, this is the same question as this post:

How can I make sure my LINQ queries execute when called in my DAL, not in a delayed fashion?

But since he didn't explain why he wanted it, the question seems to have been passed over a bit. Here's my similar-but-better-explained problem:

I have a handful of threads of two types (ignoring UI threads for a moment). There's a "data-gathering" thread type, and a "computation" thread type. The data-gathering threads are slow. There's quite a bit of data to be sifted through from a variety of places. The computation threads are comparatively fast. The design model up to this point is to send data-gathering threads off to find data, and when they're complete pass the data up for computation.

When I coded my data gathering in Linq I wound up hoisting some of that slowness back into my computation threads. There are now data elements that aren't getting resolved completely until they're used during computation -- and that's a problem.

I'd like to force Linq to finish its work at a given time (end of statement? end of method? "please finish up, dammit" method call) so that I know I'm not paying for it later on. Adding ".ToList()" to the end of the Linq query 1. is awkward, and 2. feels like boxing something that's about to be unboxed in another thread momentarily anyway.

Solution

You wouldn't be boxing anything - you'd be buffering the results.

Using ToList() is basically the way to go if you actually want the data. Unless you're ready to use the data immediately, it's got to be buffered somewhere, hasn't it? A list is just a convenient way to do that.

The alternative is to do the processing then and there as well: use the data as you produce it, eagerly. I didn't quite follow the different-threads side of things, so it's not clear to me whether that would help you, but those are basically the choices available to you as far as I can see.
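As a rough illustration of the "use the data as you produce it" option, here is a minimal sketch. The query and the running total are hypothetical stand-ins, not anything from the question; the point is only that a foreach in the producing code drives the query to completion without buffering the results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical gathered values; in practice this would be the slow source.
IEnumerable<int> query = Enumerable.Range(1, 5).Select(n => n * n);

// Consume the query where it is produced: each element is computed and
// used immediately, and nothing is buffered for later.
int total = 0;
foreach (int square in query)
{
    total += square;
}

Console.WriteLine(total); // 1 + 4 + 9 + 16 + 25 = 55
```

This only helps if the consuming work can run on the gathering side; if the data must cross to another thread later, it still has to be stored somewhere.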

This is actually somewhat explicit in your description:

The design model up to this point is to send data-gathering threads off to find data, and when they're complete pass the data up for computation.

Calling ToList() basically changes what you return from "a query which can fetch the data when asked to" to "the data itself, buffered in a list".
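To make the difference concrete, here is a small sketch. FetchSlowly and fetchCount are made-up names for illustration: the counter stands in for the expensive data-gathering work so you can see when it actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int fetchCount = 0;

// Hypothetical stand-in for a slow data source; counts how often it is hit.
IEnumerable<int> FetchSlowly()
{
    foreach (int id in new[] { 1, 2, 3 })
    {
        fetchCount++;
        yield return id;
    }
}

// Deferred: building the query does no work yet.
IEnumerable<int> query = FetchSlowly().Select(id => id * 10);
int afterBuild = fetchCount;      // still 0

// Eager: ToList() runs the query now and buffers the results.
List<int> buffered = query.ToList();
int afterToList = fetchCount;     // 3

// Reading the list later does not touch the source again.
int sum = buffered.Sum();
int afterRead = fetchCount;       // still 3

Console.WriteLine($"{afterBuild} {afterToList} {afterRead} {sum}"); // 0 3 3 60
```

If the data-gathering thread hands over `buffered` rather than `query`, the computation thread only ever reads memory that is already filled in.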

OTHER TIPS

Can you explain more why .ToList is not acceptable? You mentioned boxing and unboxing but those are completely unrelated topics.

Forcing a LINQ query to complete on demand means storing the results. Otherwise, in order to see the results again, you'd have to reprocess the query. .ToList achieves this efficiently by storing the elements in a List<T>.

It's possible to store the elements in virtually any other collection-style data structure, with various trade-offs that may suit your needs better.
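For instance, the standard operators ToArray, ToDictionary and ToLookup all execute the query immediately, as does passing the query to a collection constructor. The sample data below is invented purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data standing in for gathered results.
string[] words = { "alpha", "beta", "gamma", "beta" };
IEnumerable<string> query = words.Select(w => w.ToUpperInvariant());

// Fixed-size buffer, keeps duplicates and order.
string[] asArray = query.ToArray();

// Set semantics: duplicates collapse, fast membership tests.
var asSet = new HashSet<string>(query);

// Keyed access; keys must be unique, hence the Distinct().
Dictionary<string, int> lengthByWord =
    query.Distinct().ToDictionary(w => w, w => w.Length);

// Grouped access; unlike GroupBy, ToLookup is not deferred.
ILookup<int, string> wordsByLength = query.ToLookup(w => w.Length);

Console.WriteLine(asArray.Length);           // 4
Console.WriteLine(asSet.Count);              // 3
Console.WriteLine(lengthByWord["GAMMA"]);    // 5
Console.WriteLine(wordsByLength[4].Count()); // 2 (BETA twice)
```

Which one fits depends on how the computation side reads the data: sequentially, by key, or by group.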

If you are using LINQ to SQL, there is a LoadOptions property in the DataContext class that could help you fetch related data more eagerly.

Otherwise, you could use a few cleverly placed ToList() calls.

I know this thread is old... anyway, funny no-one mentioned .Last() yet. I'm doing something where LINQ is not much more than a glorified foreach driving some side effects, and I don't really care about the query result... so I didn't want to allocate any more bogus memory than necessary.

Licensed under: CC-BY-SA with attribution
