Why does capturing a mutable struct variable inside a closure within a using statement change its local behavior?

StackOverflow https://stackoverflow.com/questions/4642665

Question

Update: Well, now I've gone and done it: I filed a bug report with Microsoft about this, as I seriously doubt that it is correct behavior. That said, I'm still not 100% sure what to believe regarding this question, so I can see that what is "correct" is open to some level of interpretation.

My feeling is that either Microsoft will accept that this is a bug, or else respond that the modification of a mutable value type variable within a using statement constitutes undefined behavior.

Also, for what it's worth, I have at least a guess as to what is happening here. I suspect that the compiler is generating a class for the closure, "lifting" the local variable to an instance field of that class; and since it is within a using block, it's making the field readonly. As LukeH pointed out in a comment to the other question, this would prevent method calls such as MoveNext from modifying the field itself (they would instead affect a copy).
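To make that guess concrete, here is a minimal sketch that does not depend on the compiler behavior in question. It only shows the well-defined rule the guess relies on: calling a mutating method through a readonly struct field operates on a temporary copy. The Holder class and its field names are hypothetical stand-ins for a compiler-generated closure class, and Mutable is a trimmed copy of the struct used in this question.

```csharp
using System;

// Same shape as the Mutable struct in this question (IDisposable omitted).
struct Mutable
{
    int _value;
    public int Increment() { return _value++; }
}

// Hypothetical stand-in for a compiler-generated closure class.
class Holder
{
    public readonly Mutable ReadOnlyField = new Mutable();
    public Mutable WritableField = new Mutable();
}

static class ReadOnlyFieldDemo
{
    public static string Run()
    {
        var h = new Holder();

        // Outside a constructor a readonly field is a value, not a variable,
        // so each call runs on a fresh temporary copy of the struct.
        int a = h.ReadOnlyField.Increment();
        int b = h.ReadOnlyField.Increment();

        // A writable field is a variable, so the calls mutate it in place.
        int c = h.WritableField.Increment();
        int d = h.WritableField.Increment();

        return a + ", " + b + ", " + c + ", " + d;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "0, 0, 0, 1"
    }
}
```

If the lifted field really is treated as readonly, the repeated 0 from the readonly field is the same symptom as the one described in this question.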


Note: I have shortened this question for readability, though it is still not exactly short. For the original (longer) question in its entirety, see the edit history.

I have read through what I believe are the relevant sections of ECMA-334 and cannot find a conclusive answer to this question. I will state the question first, then provide a link to some additional comments for those who are interested.

Question

If I have a mutable value type that implements IDisposable, I can (1) call a method that modifies the state of the local variable's value within a using statement and the code behaves as I expect. Once I capture the variable in question inside a closure within the using statement, however, (2) modifications to the value are no longer visible in the local scope.

This behavior appears only when the variable is both captured inside a closure and declared by a using statement; it does not appear when only one condition (using) or the other (closure) is present.

Why does capturing a variable of a mutable value type inside a closure within a using statement change its local behavior?

Below are code examples illustrating items 1 and 2. Both examples will utilize the following demonstration Mutable value type:

struct Mutable : IDisposable
{
    int _value;
    public int Increment()
    {
        return _value++;
    }

    public void Dispose() { }
}

1. Mutating a value type variable within a using block

using (var x = new Mutable())
{
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The above code outputs:

0
1

2. Capturing a value type variable inside a closure within a using block

using (var x = new Mutable())
{
    // x is captured inside a closure.
    Func<int> closure = () => x.Increment();

    // Now the Increment method does not appear to affect the value
    // of local variable x.
    Console.WriteLine(x.Increment());
    Console.WriteLine(x.Increment());
}

The above code outputs:

0
0
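For comparison, here is a sketch of the closure-only case mentioned above, where the variable is captured but not declared by a using statement. A hand-written try/finally stands in for the using block; that substitution and the ClosureOnlyDemo name are assumptions made for illustration, not something taken from the specification.

```csharp
using System;

struct Mutable : IDisposable
{
    int _value;
    public int Increment() { return _value++; }
    public void Dispose() { }
}

static class ClosureOnlyDemo
{
    public static string Run()
    {
        // x is an ordinary (writable) local that happens to be captured.
        var x = new Mutable();
        try
        {
            Func<int> closure = () => x.Increment();

            int a = x.Increment();
            int b = x.Increment();
            int c = closure(); // the closure sees the same variable

            return a + ", " + b + ", " + c;
        }
        finally
        {
            x.Dispose();
        }
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "0, 1, 2"
    }
}
```

Here the local code and the closure share one variable, so every increment is visible to both.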

Further Comments

It has been noted that the Mono compiler provides the behavior I expect (changes to the value of the local variable are still visible in the using + closure case). Whether this behavior is correct or not is unclear to me.

For some more of my thoughts on this issue, see here.

Solution

It's a known bug; we discovered it a couple years ago. The fix would be potentially breaking, and the problem is pretty obscure; these are points against fixing it. Therefore it has never been prioritized high enough to actually fix it.

This has been in my queue of potential blog topics for a couple years now; perhaps I ought to write it up.

And incidentally, your conjecture as to the mechanism that explains the bug is completely accurate; nice psychic debugging there.

So, yes, known bug, but thanks for the report regardless!

OTHER TIPS

This has to do with the way closure types are generated and used. There appears to be a subtle bug in the way csc uses these types. For example, here is the IL generated by Mono's gmcs when invoking MoveNext():

      IL_0051:  ldloc.3
      IL_0052:  ldflda valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Foo/'<Main>c__AnonStorey0'::enumerator
      IL_0057:  call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()

Note that it's loading the field's address, which allows the method call to modify the instance of the value type stored on the closure object. This is what I would consider to be correct behavior, and this results in the list contents being enumerated just fine.

Here's what csc generates:

      IL_0068:  ldloc.3
      IL_0069:  ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Tinker.Form1/'<>c__DisplayClass3'::enumerator
      IL_006e:  stloc.s 5
      IL_0070:  ldloca.s 5
      IL_0072:  call instance bool valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::MoveNext()

So in this case it's taking a copy of the value type instance and invoking the method on the copy. It should be no surprise that this gets you nowhere. The get_Current() call is similarly wrong:

      IL_0052:  ldloc.3
      IL_0053:  ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Tinker.Form1/'<>c__DisplayClass3'::enumerator
      IL_0058:  stloc.s 5
      IL_005a:  ldloca.s 5
      IL_005c:  call instance !0 valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32>::get_Current()
      IL_0061:  call void class [mscorlib]System.Console::WriteLine(int32)

Since the state of the enumerator it's copying has not had MoveNext() called, get_Current() apparently returns default(int).
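The difference between the two IL sequences can be approximated in source code. The sketch below is an analogy only, not what either compiler emits: Counter and Box are hypothetical types, with Box playing the role of the closure object. Copying the field into a temporary first corresponds roughly to ldfld / stloc / ldloca, while calling the method directly on the writable field corresponds roughly to ldflda.

```csharp
using System;

// Hypothetical mutable struct with observable state.
struct Counter
{
    public int Value;
    public void Increment() { Value++; }
}

// Hypothetical stand-in for the closure object that holds the struct.
class Box
{
    public Counter Field;
}

static class CopyVersusAddressDemo
{
    public static string Run()
    {
        var box = new Box();

        // Roughly ldfld / stloc / ldloca: the call mutates the temporary only.
        Counter temp = box.Field;
        temp.Increment();
        int afterCallOnCopy = box.Field.Value;

        // Roughly ldflda: the call mutates the field in place.
        box.Field.Increment();
        int afterCallInPlace = box.Field.Value;

        return afterCallOnCopy + ", " + afterCallInPlace;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "0, 1"
    }
}
```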

In short: csc appears to be buggy. It's interesting that Mono got this right while MS.NET did not!

...I'd love to hear Jon Skeet's comments on this particular oddity.


In a discussion with brajkovic in #mono, he determined that the C# language specification does not actually detail how the closure type should be implemented, nor how accesses of locals that are captured in the closure should get translated. An example implementation in the spec seems to use the "copy" method that csc uses. Therefore either compiler output can be considered correct according to the language specification, though I would argue that csc should at least copy the local back to the closure object after the method call.
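As a rough illustration of the copy-back idea, the following sketch writes the temporary back to the holder object after each call. It is a hand-written approximation under assumed names (Counter, Box, and CopyBackDemo are hypothetical), not a description of what csc or gmcs actually generates.

```csharp
using System;

// Hypothetical mutable struct with observable state.
struct Counter
{
    public int Value;
    public void Increment() { Value++; }
}

// Hypothetical stand-in for the closure object.
class Box
{
    public Counter Field;
}

static class CopyBackDemo
{
    // Copy the field out, call the method on the copy, then store the copy back.
    static void IncrementWithCopyBack(Box box)
    {
        Counter temp = box.Field;
        temp.Increment();
        box.Field = temp;
    }

    public static int Run()
    {
        var box = new Box();
        IncrementWithCopyBack(box);
        IncrementWithCopyBack(box);
        return box.Field.Value;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints 2
    }
}
```

With the write-back in place, the mutations survive even though each call runs on a temporary.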

EDIT: This is incorrect; I didn't read the question carefully enough.

Placing the struct into a closure causes an assignment, and assigning a value type copies the value. So what's happening is that you are creating a new List<int>.Enumerator, and Current on that enumerator will return 0.

using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<int> l = new List<int>();
        Console.WriteLine(l.GetEnumerator().Current);
    }
}

Result: 0
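The copy semantics described above can be observed directly with List<int>.Enumerator. This is a small sketch with made-up variable names: advancing a copy of the enumerator leaves the original where it was, so the original still reports the default value.

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorCopyDemo
{
    public static string Run()
    {
        var list = new List<int> { 1, 2, 3 };

        List<int>.Enumerator original = list.GetEnumerator();
        List<int>.Enumerator copy = original; // struct assignment copies the state

        copy.MoveNext();

        int fromCopy = copy.Current;         // the copy has advanced to the first element
        int fromOriginal = original.Current; // the original has not moved

        return fromCopy + ", " + fromOriginal;
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints "1, 0"
    }
}
```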

The problem is that the enumerator is stored in another class, so every action works on a copy of the enumerator.

[CompilerGenerated]
private sealed class <>c__DisplayClass3
{
    // Fields
    public List<int>.Enumerator enumerator;

    // Methods
    public int <Main>b__1()
    {
        return this.enumerator.Current;
    }
}

public static void Main(string[] args)
{
    List<int> <>g__initLocal0 = new List<int>();
    <>g__initLocal0.Add(1);
    <>g__initLocal0.Add(2);
    <>g__initLocal0.Add(3);
    List<int> list = <>g__initLocal0;
    Func<int> CS$<>9__CachedAnonymousMethodDelegate2 = null;
    <>c__DisplayClass3 CS$<>8__locals4 = new <>c__DisplayClass3();
    CS$<>8__locals4.enumerator = list.GetEnumerator();
    try
    {
        if (CS$<>9__CachedAnonymousMethodDelegate2 == null)
        {
            CS$<>9__CachedAnonymousMethodDelegate2 = new Func<int>(CS$<>8__locals4.<Main>b__1);
        }
        while (CS$<>8__locals4.enumerator.MoveNext())
        {
            Console.WriteLine(CS$<>8__locals4.enumerator.Current);
        }
    }
    finally
    {
        CS$<>8__locals4.enumerator.Dispose();
    }
}

Without the lambda the code is closer to what you would expect.

public static void Main(string[] args)
{
    List<int> <>g__initLocal0 = new List<int>();
    <>g__initLocal0.Add(1);
    <>g__initLocal0.Add(2);
    <>g__initLocal0.Add(3);
    List<int> list = <>g__initLocal0;
    using (List<int>.Enumerator enumerator = list.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            Console.WriteLine(enumerator.Current);
        }
    }
}

Specific IL

L_0058: ldfld valuetype [mscorlib]System.Collections.Generic.List`1/Enumerator<int32> Machete.Runtime.Environment/<>c__DisplayClass3::enumerator
L_005d: stloc.s CS$0$0001
L_005f: ldloca.s CS$0$0001
Licensed under: CC-BY-SA with attribution