Question

I've got a tree-like structure. Each element in this structure should be able to return a Enumerable of all elements it is root to. Let's call this method IEnumerable<Foo> GetAll(). So if we have

      A <-- topmost root
    /   \
   B     C
  / \   / \
  D  E  F  G

a call to GetAll on element C returns {C, F, G} (fixed order of elements would be nice, but is not needed). I guess everybody knew that already.

The current implementation of GetAll looks like this:

public IEnumerable<Foo> GetAll ()
{
    yield return this;

    foreach (Foo foo in MyChildren) {
        foreach (Foo f in foo.GetAll ()) {
            yield return f;
        }
    }
}

In an earlier implementation, I returned a List and added the child-foos using List.AddRange().

My question is if the version using yield is correcly implemented or if it should be improved (esp. in terms of performance). Or is this just bad and I should stick to Lists (or ReadOnlyCollections) instead?

Was it helpful?

Solution

It's certainly not ideal in terms of performance - you end up creating a lot of iterators for large trees, instead of a single iterator which knows how to traverse efficiently.

Some blog entries concerning this:

It's worth noting that F# has the equivalent of the proposed "yield foreach" with "yield!"

OTHER TIPS

You can improve performance if you unroll recurse to stack, so you will have only one iterator:

public IEnumerable<Foo> GetAll()
{
    Stack<Foo> FooStack = new Stack<Foo>();
    FooStack.Push(this);

    while (FooStack.Count > 0)
    {
        Foo Result = FooStack.Pop();
        yield return Result;
        foreach (Foo NextFoo in Result.MyChildren)
            FooStack.Push(NextFoo);
    }
}

A better solution might be to create a visit method that recursively traverses the tree, and use that to collect items up.

Something like this (assuming a binary tree):

public class Node<T>
{
    public void Visit(Action<T> action)
    {
        action(this);
        left.Visit(action);
        right.Visit(action);
    }

    public IEnumerable<Foo> GetAll ()
    {
        var result = new List<T>();
        Visit( n => result.Add(n));
        return result;
    }
}

Taking this approach

  • Avoids creating large numbers of nested iterators
  • Avoids creating any more lists than necessary
  • Is relatively efficient
  • Falls down if you only need part of the list regularly

No, that looks fine.

Have a look at my blog entry, it may be of some use :)

According to my prior experiences using yield is a lot more effective than creating a List. If you're using .NET 3.5, this implementation should be fine. But don't forget the

yield break;

At the end. :-)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top