Question

I often find myself returning an IEnumerable from a method that uses the yield return statement, and then writing other methods that call it with different parameters and return its result directly.

Is there any performance benefit to iterating through the results and yield returning them, as opposed to simply returning the collection? That is, if I use return on the resulting IEnumerable rather than looping through the results again and using yield return, does the compiler know to generate the results only as they're required, or does it wait for the entire collection to be built before returning all the results?

public class Demo
{
    private IEnumerable<int> GetNumbers(int x)
    {
        //imagine this operation were more expensive than this demo version
        //e.g. calling a web service on each iteration.
        for (int i = 0; i < x; i++)
        {
            yield return i;
        }
    }
    //does this wait for the full list before returning
    public IEnumerable<int> GetNumbersWrapped()
    {
        return GetNumbers(10);
    }
    //whilst this allows the benefits of Yield Return to persist?
    public IEnumerable<int> GetNumbersWrappedYield()
    {
        foreach (int i in GetNumbers(10))
            yield return i;
    }
}

Solution

GetNumbersWrapped simply passes through the original enumerable - it hasn't even invoked the iterator at that point. The end result remains fully deferred / lazy / spooling / whatever else.

GetNumbersWrappedYield adds an extra layer of abstraction - so now, every call to MoveNext has to do ever so slightly more work; not enough to cause pain. It too will be fully deferred / lazy / spooling / whatever else - but adds some minor overheads. Usually these minor overheads are justified by the fact that you are adding some value such as filtering or additional processing; but not really in this example.
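One way to check the claim that both wrappers stay deferred is to count how often the loop body actually runs. The sketch below is only an illustration: the LazyDemo class, the Produced counter and the Measure helper are hypothetical names added for the demo, and the counter stands in for the expensive per-item work mentioned in the question.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Hypothetical counter: how many times the "expensive" loop body has run.
    public static int Produced;

    private static IEnumerable<int> GetNumbers(int x)
    {
        for (int i = 0; i < x; i++)
        {
            Produced++; // stands in for the expensive per-item work
            yield return i;
        }
    }

    // Passes the iterator through untouched.
    public static IEnumerable<int> GetNumbersWrapped()
    {
        return GetNumbers(10);
    }

    // Re-yields each item through a second iterator.
    public static IEnumerable<int> GetNumbersWrappedYield()
    {
        foreach (int i in GetNumbers(10))
            yield return i;
    }

    // Reports how many items had been produced right after calling the
    // method, and again after consuming only the first three results.
    public static (int afterCall, int afterThree) Measure(Func<IEnumerable<int>> source)
    {
        Produced = 0;
        IEnumerable<int> numbers = source();
        int afterCall = Produced; // the iterator body has not started yet

        int seen = 0;
        foreach (int n in numbers)
        {
            if (++seen == 3) break; // stop early, as a lazy consumer might
        }
        return (afterCall, Produced);
    }
}
```

Calling LazyDemo.Measure(LazyDemo.GetNumbersWrapped) and LazyDemo.Measure(LazyDemo.GetNumbersWrappedYield) should give the same pair for both: zero items produced after the call, three after the partial loop. The only difference is the extra iterator that each MoveNext has to pass through in the yield version.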

Note: one thing that GetNumbersWrappedYield does do is prevent abuse by callers. For example, if GetNumbers was:

private IEnumerable<int> GetNumbers(int x)
{
    return myPrivateSecretList;
}

then callers of GetNumbersWrapped can abuse this:

IList list = (IList)GetNumbersWrapped(); // GetNumbers itself is private
list.Clear();
list.Add(123); // mwahahaahahahah

However, callers of GetNumbersWrappedYield cannot do this... at least, not quite as easily. They could, of course, still use reflection to pull the iterator apart, obtain the wrapped inner reference, and cast that.

This, however, is not usually a genuine concern.
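The cast trick can be sketched as a self-contained example. The SecretHolder and AbuseDemo classes, the Count property and the CanCastToList helper are hypothetical names used only for this illustration; the list contents are made up.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class SecretHolder
{
    private readonly List<int> myPrivateSecretList = new List<int> { 1, 2, 3 };

    private IEnumerable<int> GetNumbers(int x)
    {
        return myPrivateSecretList; // hands out the list object itself
    }

    // The caller receives the very same List<int> instance.
    public IEnumerable<int> GetNumbersWrapped()
    {
        return GetNumbers(4);
    }

    // The caller receives a compiler-generated iterator, not the list.
    public IEnumerable<int> GetNumbersWrappedYield()
    {
        foreach (int i in GetNumbers(4))
            yield return i;
    }

    // Hypothetical helper so the demo can observe the private list.
    public int Count => myPrivateSecretList.Count;
}

public static class AbuseDemo
{
    // True if the sequence can be cast to a mutable, non-generic IList.
    public static bool CanCastToList(IEnumerable<int> sequence)
    {
        return sequence is IList;
    }
}
```

With this setup, AbuseDemo.CanCastToList(holder.GetNumbersWrapped()) is true, and a caller who casts and calls Clear() empties the private list. The same check on holder.GetNumbersWrappedYield() is false, because the iterator object does not implement IList.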

OTHER TIPS

A function that returns an IEnumerable<T> is indeed returning an object. That object is not the full list, but it knows how to produce the items one at a time when its enumerator's MoveNext method is called. The same is true for your GetNumbersWrapped method above: it doesn't wait for the full collection, it just returns that object. Whenever a foreach loop (or any other consumer) calls the enumerator's MoveNext method, the next value is read. So GetNumbersWrapped and GetNumbersWrappedYield behave the same, except that GetNumbersWrappedYield has one redundant looping layer.
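To make the MoveNext mechanics concrete, here is a rough sketch that walks the enumerator by hand, roughly as a foreach loop would. The EnumeratorDemo class, the Log list and the Walk method are hypothetical names introduced for the illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorDemo
{
    // Hypothetical log, so the order of events can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    private static IEnumerable<int> GetNumbers(int x)
    {
        for (int i = 0; i < x; i++)
        {
            Log.Add("producing " + i);
            yield return i;
        }
    }

    public static IEnumerable<int> GetNumbersWrapped()
    {
        return GetNumbers(3);
    }

    // Does by hand roughly what a foreach loop does.
    public static void Walk()
    {
        Log.Clear();
        IEnumerable<int> numbers = GetNumbersWrapped();
        Log.Add("got the enumerable"); // no item has been produced yet

        using (IEnumerator<int> e = numbers.GetEnumerator())
        {
            Log.Add("got the enumerator"); // still nothing produced
            while (e.MoveNext())           // each call runs up to the next yield
            {
                Log.Add("consumed " + e.Current);
            }
        }
    }
}
```

After EnumeratorDemo.Walk(), the log starts with the two "got the ..." entries and then alternates "producing n" and "consumed n", which shows that each value is generated only when MoveNext asks for it.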

If I understand your question correctly, the function GetNumbersWrapped does not wait for the full list before returning, because its return type is an IEnumerable and the execution of the loop inside GetNumbers is deferred.

As you probably know, an iterator method that returns IEnumerable<T> is evaluated lazily. This means that all three methods will effectively do exactly the same thing, but the result of GetNumbersWrappedYield will just be wrapped in a redundant enumerator. There won't be any performance benefit from doing this; in fact, the extra enumerator will introduce a slight overhead. I'd stick with returning the result of GetNumbers directly.

Licensed under: CC-BY-SA with attribution