Question

While working on a project, I wrote an iterator block similar to the following:

public class Sequence<T> : IEnumerable<T>
{
    public T Head{get; private set;}
    public Sequence<T> Tail {get; private set;}

    public bool IsEmpty {get; private set;}

    public IEnumerator<T> GetEnumerator()
    {
        Sequence<T> collection = this;

        while (!collection.IsEmpty)
        {
            yield return collection.Head;
            collection = collection.Tail;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

I expected that after the second call to MoveNext, the GC would be able to collect the original collection, since the iterator block would no longer hold a reference to it, only to its tail (see collection = collection.Tail).

However, this did not happen. I discovered that the compiler-generated IEnumerator<T> always holds a reference to the instance of Sequence<T> that created it.

To prove this, I wrote the following iterator block and inspected the generated IL:

public IEnumerator<T> GetEnumerator()
{
    yield return default(T);
}

To my surprise, the IL was equivalent to this:

public IEnumerator<T> GetEnumerator()
{
    // The constructor argument is the initial state (ldc.i4.0 in the IL below).
    var enumerator = new CompilerGeneratedEnumerator(0);
    enumerator.this_field = this;
    return enumerator;
}

Verbatim:

.maxstack 2
.locals init (
    [0] class Sequences.Sequence`1/'<GetEnumerator>d__3'<!T>
)

IL_0000: ldc.i4.0
IL_0001: newobj instance void class Sequences.Sequence`1/'<GetEnumerator>d__3'<!T>::.ctor(int32)
IL_0006: stloc.0
IL_0007: ldloc.0
IL_0008: ldarg.0
IL_0009: stfld class Sequences.Sequence`1<!0> class Sequences.Sequence`1/'<GetEnumerator>d__3'<!T>::'<>4__this'
IL_000e: ldloc.0
IL_000f: ret

By looking at the IL for <GetEnumerator>d__3, it seems the <>4__this field is never accessed. So why is it generated anyway? Why does the enumerator need to point to the instance of Sequence<T> that created it?

I was able to get around this problem by writing my own IEnumerator<T>, but I'm still wondering why this happens in the first place.


If you want to compile this yourself, you can grab the project's source from here: https://github.com/dcastro/Sequences

And here's the original iterator block:

ISequence<T> sequence = this;

while (!sequence.IsEmpty)
{
    yield return sequence.Head;
    sequence = sequence.Tail;
}

Solution

Logically your first method should capture this. This line:

Sequence<T> collection = this;

... will only execute on the first call to MoveNext(), so it really does need to capture it, and it can only capture it in an instance variable in the generated code. The compiler could explicitly null it out after its final use, but usually that would just be wasteful.
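
To make that concrete, here is a rough hand-written approximation of what such a state machine looks like. It is a sketch only, not the compiler's actual output: SketchSequence, SketchEnumerator, CapturedThis and the state numbering are all names invented for illustration, and the public fields exist just to keep it short.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for Sequence<T>; public fields keep the sketch short.
public class SketchSequence<T>
{
    public T Head;
    public SketchSequence<T> Tail;
    public bool IsEmpty;

    public IEnumerator<T> GetEnumerator()
    {
        // Mirrors the IL in the question: construct in state 0, then store 'this'.
        var enumerator = new SketchEnumerator<T>(0);
        enumerator.CapturedThis = this;
        return enumerator;
    }
}

// Hand-written approximation of the generated class, not the real output.
public sealed class SketchEnumerator<T> : IEnumerator<T>
{
    private int _state;                     // 0 = not started, 1 = suspended at yield, -1 = finished
    private T _current;
    private SketchSequence<T> _collection;  // the hoisted local 'collection'
    public SketchSequence<T> CapturedThis;  // plays the role of <>4__this (public only for inspection)

    public SketchEnumerator(int state) { _state = state; }

    public T Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:
                // 'Sequence<T> collection = this;' only runs here, on the first call,
                // so 'this' had to be stored somewhere until now.
                _collection = CapturedThis;
                break;
            case 1:
                // Resume after 'yield return': 'collection = collection.Tail;'
                _collection = _collection.Tail;
                break;
            default:
                return false;
        }

        if (_collection.IsEmpty)
        {
            _state = -1;
            return false;
        }

        // CapturedThis is never cleared on any path, so the head stays reachable.
        _current = _collection.Head;
        _state = 1;
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}
```

The field standing in for <>4__this is read exactly once, in state 0, and nothing resets it afterwards. That is why the first node stays reachable for as long as the enumerator does.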

Now your second case is more interesting. Yes, in order to complete the method it doesn't need a reference to this - but if you were in a debugger, and you had a breakpoint on the yield return statement, you would expect to be able to inspect this, as you're in an instance method. So at least in a build with debug information and no optimization, I think it's reasonable to include this as an instance variable. In an optimized build it would make sense not to capture this (and accept that if you're debugging a build not meant for debugging, there are some limitations) but I guess this is just an optimization the compiler authors didn't consider important.

Other tips

If it's important that the original item is GCable after iteration has moved on, you could implement an IEnumerator<T> yourself, rather than having the compiler generate one for you.

The following compiles, but could probably be improved:

using System;
using System.Collections;
using System.Collections.Generic;

public class MyCollection<T> : IEnumerable<T>
{
    private T Head;
    private MyCollection<T> Tail;
    private bool IsEmpty;

    private class ThisEnumerator : IEnumerator<T>
    {
        public ThisEnumerator(MyCollection<T> toIterate)
        {
            _innerCurrent = toIterate;
        }
        private MyCollection<T> _innerCurrent;
        private bool _hasMoved = false;
        public T Current
        {
            get
            {
                return _innerCurrent.Head;
            }
        }

        object IEnumerator.Current
        {
            get
            {
                return this.Current;
            }
        }

        public void Dispose()
        {
            _innerCurrent = null;
        }

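        public bool MoveNext()
        {
            // After Dispose, or once past the end, keep returning false
            // instead of dereferencing a null reference.
            if (_innerCurrent == null || _innerCurrent.IsEmpty)
            {
                return false;
            }
            if (_hasMoved)
            {
                // Overwriting the field drops the last reference to the old node.
                _innerCurrent = _innerCurrent.Tail;
            }
            else
            {
                _hasMoved = true;
            }
            return !_innerCurrent.IsEmpty;
        }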

        public void Reset()
        {
            throw new NotSupportedException();
        }
    }

    public IEnumerator<T> GetEnumerator()
    {
        return new ThisEnumerator(this);
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
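
Another possible workaround, offered as an assumption to check rather than a confirmed fix: move the loop into a static iterator that takes the sequence as a parameter. A static method has no this to capture. As far as I know, the parameter of an IEnumerator<T>-returning iterator is hoisted into a single field, so assigning to it overwrites that field as the loop advances. LinkedSeq, its constructors and Enumerate are hypothetical names for this sketch. It is worth inspecting the IL from your own compiler before relying on it.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for Sequence<T>; the constructors are assumptions.
public class LinkedSeq<T> : IEnumerable<T>
{
    public T Head { get; private set; }
    public LinkedSeq<T> Tail { get; private set; }
    public bool IsEmpty { get; private set; }

    public static readonly LinkedSeq<T> Empty = new LinkedSeq<T>();

    private LinkedSeq()
    {
        IsEmpty = true;
    }

    public LinkedSeq(T head, LinkedSeq<T> tail)
    {
        Head = head;
        Tail = tail;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // Not an iterator block itself, so no state machine is created here.
        return Enumerate(this);
    }

    // Static iterator: there is no 'this' to capture. The parameter is
    // reassigned as the loop advances, so the previous node is not held here.
    private static IEnumerator<T> Enumerate(LinkedSeq<T> node)
    {
        while (!node.IsEmpty)
        {
            yield return node.Head;
            node = node.Tail;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

A sequence built as new LinkedSeq<int>(1, new LinkedSeq<int>(2, LinkedSeq<int>.Empty)) enumerates the same way as before. The only change is where the iterator block lives.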

As for the "why" question, as I indicated in a comment, I just think it's likely that most iterators (and async methods, which share a lot of this machinery) will need to access this.

License: CC BY-SA with attribution
Not affiliated with StackOverflow