The foreach identifier and closures

https://stackoverflow.com/questions/512166

21-08-2019
|

Question

In the two following snippets, is the first one safe or must you do the second one?

By safe I mean is each thread guaranteed to call the method on the Foo from the same loop iteration in which the thread was created?

Or must you copy the reference to a new variable "local" to each iteration of the loop?

var threads = new List<Thread>();
foreach (Foo f in ListOfFoo)
{      
    Thread thread = new Thread(() => f.DoSomething());
    threads.Add(thread);
    thread.Start();
}

var threads = new List<Thread>();
foreach (Foo f in ListOfFoo)
{      
    Foo f2 = f;
    Thread thread = new Thread(() => f2.DoSomething());
    threads.Add(thread);
    thread.Start();
}

Update: As pointed out in Jon Skeet's answer, this doesn't have anything specifically to do with threading.

Solution

Edit: this all changes in C# 5, with a change to where the variable is defined (in the eyes of the compiler). From C# 5 onwards, they are the same.

Before C#5

The second is safe; the first isn't.

With foreach, the variable is declared outside the loop - i.e.

Foo f;
while(iterator.MoveNext())
{
     f = iterator.Current;
    // do something with f
}

This means that there is only 1 f in terms of the closure scope, and the threads might very likely get confused - calling the method multiple times on some instances and not at all on others. You can fix this with a second variable declaration inside the loop:

foreach(Foo f in ...) {
    Foo tmp = f;
    // do something with tmp
}

This then has a separate tmp in each closure scope, so there is no risk of this issue.

Here's a simple proof of the problem:

    static void Main()
    {
        int[] data = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        foreach (int i in data)
        {
            new Thread(() => Console.WriteLine(i)).Start();
        }
        Console.ReadLine();
    }

Outputs (at random):

Add a temp variable and it works:

        foreach (int i in data)
        {
            int j = i;
            new Thread(() => Console.WriteLine(j)).Start();
        }

(each number once, but of course the order isn't guaranteed)

OTHER TIPS

Pop Catalin and Marc Gravell's answers are correct. All I want to add is a link to my article about closures (which talks about both Java and C#). Just thought it might add a bit of value.

EDIT: I think it's worth giving an example which doesn't have the unpredictability of threading. Here's a short but complete program showing both approaches. The "bad action" list prints out 10 ten times; the "good action" list counts from 0 to 9.

using System;
using System.Collections.Generic;

class Test
{
    static void Main() 
    {
        List<Action> badActions = new List<Action>();
        List<Action> goodActions = new List<Action>();
        for (int i=0; i < 10; i++)
        {
            int copy = i;
            badActions.Add(() => Console.WriteLine(i));
            goodActions.Add(() => Console.WriteLine(copy));
        }
        Console.WriteLine("Bad actions:");
        foreach (Action action in badActions)
        {
            action();
        }
        Console.WriteLine("Good actions:");
        foreach (Action action in goodActions)
        {
            action();
        }
    }
}

Your need to use option 2, creating a closure around a changing variable will use the value of the variable when the variable is used and not at closure creation time.

The implementation of anonymous methods in C# and its consequences (part 1)

The implementation of anonymous methods in C# and its consequences (part 2)

The implementation of anonymous methods in C# and its consequences (part 3)

Edit: to make it clear, in C# closures are "lexical closures" meaning they don't capture a variable's value but the variable itself. That means that when creating a closure to a changing variable the closure is actually a reference to the variable not a copy of it's value.

Edit2: added links to all blog posts if anyone is interested in reading about compiler internals.

This is an interesting question and it seems like we have seen people answer in all various ways. I was under the impression that the second way would be the only safe way. I whipped a real quick proof:

class Foo
{
    private int _id;
    public Foo(int id)
    {
        _id = id;
    }
    public void DoSomething()
    {
        Console.WriteLine(string.Format("Thread: {0} Id: {1}", Thread.CurrentThread.ManagedThreadId, this._id));
    }
}
class Program
{
    static void Main(string[] args)
    {
        var ListOfFoo = new List<Foo>();
        ListOfFoo.Add(new Foo(1));
        ListOfFoo.Add(new Foo(2));
        ListOfFoo.Add(new Foo(3));
        ListOfFoo.Add(new Foo(4));


        var threads = new List<Thread>();
        foreach (Foo f in ListOfFoo)
        {
            Thread thread = new Thread(() => f.DoSomething());
            threads.Add(thread);
            thread.Start();
        }
    }
}

if you run this you will see option 1 is definetly not safe.

In your case, you can avoid the problem without using the copying trick by mapping your ListOfFoo to a sequence of threads:

var threads = ListOfFoo.Select(foo => new Thread(() => foo.DoSomething()));
foreach (var t in threads)
{
    t.Start();
}

Both are safe as of C# version 5 (.NET framework 4.5). See this question for details: Has foreach's use of variables been changed in C# 5?

Foo f2 = f;

points to the same reference as

So nothing lost and nothing gained ...

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow