Question

I'm trying to design a program that uses an external OCR application to flip an image until it's right side up. All of the image locations are kept in files[].

The problem is that doing one file at a time is too slow to handle the tens of thousands of images I have. I need to launch several instances of the OCR program to scan multiple images at the same time.

My crappy implementation is the following:

public Program(string[] files)
    {
        for(int i = 0; i < files.Length; i++)
        {
            ThreadStart start = () => {flip(files[i]);};
            Thread t = new Thread(start);
            t.Start();
            if(i % 5 == 0)
            {
                t.Join();
            }
        }
    }

The code is supposed to launch 5 instances of the OCR program. On every fifth, it waits for the thread to close before continuing. This is supposed to act as a buffer.

However, what's happening instead is that the same file gets passed to the OCR program more than once, rather than a different file on each iteration. Different threads are grabbing the same file, and the OCR instances crash when they go to work on it at the same time.

Does anyone have any idea what's going on, or know a completely different approach I can take?


Solution 2

The problem is that your lambda expression is capturing the variable i, rather than its value for that iteration of the loop.
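
The effect is easier to see without threads. Here is a minimal sketch (the class name and the list of delegates are just for illustration, not part of your program) where the lambdas are only invoked after the loop has finished:

```csharp
using System;
using System.Collections.Generic;

class CaptureDemo
{
    static void Main()
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            // Every lambda captures the single variable i, not its current value.
            actions.Add(() => i);
        }

        // The loop has finished, so i is 3 for every delegate.
        foreach (var action in actions)
        {
            Console.WriteLine(action()); // prints 3 each time
        }
    }
}
```

With threads the read of i happens whenever each thread gets scheduled, so two threads can see the same index, and the last one can even see files.Length and run off the end of the array.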

There are two options:

Capture a copy

for (int i = 0; i < files.Length; i++)
{
    int copy = i;
    ThreadStart start = () => flip(files[copy]); // Braces aren't needed
    ...
}

Use foreach - C# 5 or later only!

This won't help as much in your case because you're joining on every fifth item, so you need the index. But if you didn't need that, and you were using C# 5 or later, you could just use:

foreach (var file in files)
{
    ThreadStart start = () => flip(file);
    ...
}

Note that prior to C# 5, this would have had exactly the same problem.
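
A quick way to check which behaviour your compiler gives you is a sketch like the following (the file names are made up for the demo):

```csharp
using System;
using System.Collections.Generic;

class ForeachCaptureDemo
{
    static void Main()
    {
        string[] files = { "a.png", "b.png", "c.png" }; // hypothetical names
        var actions = new List<Func<string>>();

        foreach (var file in files)
        {
            // In C# 5 and later, each iteration gets a fresh 'file' variable.
            actions.Add(() => file);
        }

        foreach (var action in actions)
        {
            Console.WriteLine(action());
        }
    }
}
```

Compiled as C# 5 or later this prints a.png, b.png and c.png. A pre-C# 5 compiler shares one variable across all iterations, so it prints c.png three times.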

For more details of the problem, see Eric Lippert's blog posts (part one; part two).

OTHER TIPS

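You're suffering from a problem called accessing a modified closure. The value of i keeps changing while the threads are starting, so a lambda can read a later value than the one it was created with. Copy i into a local variable declared inside the loop and capture that instead.

        for (int i = 0; i < files.Length; i++)
        {
            // A new currenti exists for each iteration, so each lambda sees its own index.
            int currenti = i;
            ThreadStart start = () => { flip(files[currenti]); };
            Thread t = new Thread(start);
            t.Start();
            if (i % 5 == 0)
            {
                t.Join();
            }
        }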
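
As for a completely different approach: joining every fifth thread only waits for that one thread, so more than five OCR instances can still overlap. If you're on .NET 4 or later, one option is to let Parallel.ForEach do the throttling. This is only a sketch: Flip here is a stand-in for your flip method that sleeps instead of launching the OCR program, the file names are invented, and the running/peak counters exist purely to show the cap.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ThrottledFlip
{
    static readonly object Gate = new object();
    static int running; // calls to Flip currently in progress
    static int peak;    // highest value 'running' has reached

    // Hypothetical stand-in for flip(); the real one would launch the OCR
    // program and wait for it to exit.
    static void Flip(string file)
    {
        lock (Gate) { running++; peak = Math.Max(peak, running); }
        Thread.Sleep(20); // simulate the OCR work
        lock (Gate) { running--; }
    }

    static void Main()
    {
        // Made-up file names, for illustration only.
        string[] files = Enumerable.Range(0, 23)
                                   .Select(n => "image" + n + ".png")
                                   .ToArray();

        var options = new ParallelOptions { MaxDegreeOfParallelism = 5 };

        // Each call receives its own 'file' parameter, so no loop variable is captured.
        // ForEach returns once every file has been processed.
        Parallel.ForEach(files, options, file => Flip(file));

        Console.WriteLine("Peak concurrent calls: " + peak); // never more than 5
    }
}
```

Because the file is passed in as a parameter rather than captured from a loop, the closure problem doesn't come up at all.
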
Licensed under: CC-BY-SA with attribution