Changing a method that has “return” and “yield return”

https://stackoverflow.com/questions/8681916

06-05-2021
|

Question

I know it's impossible to use return and yield return in the same method.

This is the code that I would like to optimize:

public IEnumerable<TItem> GetItems(int data)
{
    if (this.isSingleSet)
    {
        return this.singleSet; // is IEnumerable per-se
    }
    else
    {
        int index = this.GetSet(data);
        foreach(TKey key in this.keySets[index])
        {
            yield return this.items[key];
        }
    }
}

Important: I know this code doesn't compile. It's the code I have to optimize.

There are two ways that I know of that would make this method working:

convert yield return part:

...
else
{
    int index = this.GetSet(data);
    return this.keySets[index].Select(key => this.items[key]);
}

convert return part:

if (this.isSingleSet)
{
    foreach(TItem item in this.singleSet)
    {
        yield return item;
    }
}
else ...

But there's a big speed difference between the two. Using only return statements (in other words using Select()) is much much slower (like 6 times slower) to yield return conversion.

Question

Is there any other way that comes to your mind how to write this method? Do you have any other suggestions information that would be valuable to performance discrepancy?

Additional info

I was measuring speed of the two methods by using stopwatch around a for loop.

Stopwatch s = new Stopwatch();
s.Start();
for(int i = 0; i < 1000000; i++)
{
    GetItems(GetRandomData()).ToList();
}
s.Stop();
Console.WriteLine(s.ElapsedMilliseconds);

Each of these were loops were run in separate processes so there was could be no performance influence by garbage collection or anything else.

I've run the program with one method version then
Closed it
Rewrote the method and run it again.

Did this few times to see reliable performance difference...

Solution

Use two functions. The outer, called by clients, function does all the non-lazy bits (like parameter validation) that you don't want to delay. The private worker does the lazy bits:

public IEnumerable<TItem> GetItems(int data) {
  if (this.isSingleSet) {
    return this.singleSet; // is IEnumerable per-se
  } else {
    return DoGetItems(data);
  }
}

private IEnumerable<TItem> DoGetItems(int data) {
  int index = this.GetSet(data);
  foreach(TKey key in this.keySets[index]) {
    yield return this.items[key];
  }
}

OTHER TIPS

The implementation of Select is (with the error checking removed):

public static IEnumerable<R> Select<A, R>(
    this IEnumerable<A> sequence, 
    Func<A, R> projection)
{
    foreach(A item in sequence) 
        yield return projection(item);
}

So I have a hard time believing that your using Select is hugely slower than the almost-identical foreach loop you already have. It will be slowed down by doing error checking (once) and by creating the delegate (once), and the slight overhead of indirecting through the delegate. But the loop machinery should be identical.

However, if I've learned one thing in performance analysis, it's that my expectations are frequently dead wrong. What does your profiling run indicate is the bottleneck in your application? Let's reason from facts, not from guesses here. What is the hot spot?

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow