Question

I am writing a tool that walks a large directory tree and extracts data, starting one thread per language code (the first level of folders in the root directory). The tool was deadlocking, so I added a loop that blocks any writes to the database until all threads have finished. However, when I test it, the DB ends up with the wrong number of languages even though the test data is static: I have 67 languages, but only 48 are in the DB. I suspect my wait loop is broken, i.e. files are being added to the DB before all threads have stopped, so languages are lost along the way. Has anyone come across a similar issue, or does anyone know how to solve it? Thanks.

//get the languages from the folders
string[] filePaths = Directory.GetDirectories(rootDirectory);
for (int i = 0; i < filePaths.Length; i++)
{
    string LCID = filePaths[i].Split('\\').Last();
    Console.WriteLine(LCID);
    //go through files in each folder and sub-folder with threads
    Thread t1 = new Thread(() => new HBScanner(new DirectoryInfo(filePaths[i - 1])).HBscan());
    t1.Start();
    threads.Add(t1);
}

// wait for all threads to complete before proceeding
foreach (Thread thread in threads)
{
    while (thread.ThreadState != ThreadState.Stopped)
    {
        //wait
    }
}

Solution

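First and foremost: make a local copy of the path and pass that into the thread instead of using the for-loop variable. The lambda captures the variable i itself, not the value it had when the thread was created, so by the time a thread actually runs, i may already have moved on to a different value.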

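The index-out-of-range exception is most likely a symptom of the same problem: filePaths[i - 1] is evaluated when the thread runs, not when it is created, so a thread that starts while i is still 0 asks for index -1, and a thread that starts late scans the wrong directory. That would also explain the missing languages. You can avoid the index altogether by using a foreach loop.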

//get the languages from the folders
string[] filePaths = Directory.GetDirectories(rootDirectory);
foreach(string filePath in filePaths)
{
    Console.WriteLine(filePath.Split('\\').Last());

    string tmpPath = filePath; // <-- local copy

    //go through files in each folder and sub-folder with threads
    Thread t1 = new Thread(() => new HBScanner(new DirectoryInfo(tmpPath)).HBscan());
    t1.Start();
    threads.Add(t1);
}
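
To see the capture problem in isolation, here is a minimal sketch that uses plain delegates instead of threads, so the result is deterministic. The class and method names are made up for illustration and are not part of the original tool.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosureDemo
{
    // Every lambda closes over the same for-loop variable i.
    public static List<int> CaptureLoopVariable(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            actions.Add(() => i); // captures the variable, not its current value
        }
        // The loop has finished, so i == count for every lambda.
        return actions.Select(a => a()).ToList();
    }

    // Each lambda closes over its own per-iteration copy.
    public static List<int> CaptureLocalCopy(int count)
    {
        var actions = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i; // a fresh variable on every iteration
            actions.Add(() => copy);
        }
        return actions.Select(a => a()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CaptureLoopVariable(3))); // prints 3,3,3
        Console.WriteLine(string.Join(",", CaptureLocalCopy(3)));    // prints 0,1,2
    }
}
```

With threads the effect is the same, except that the value each thread sees depends on timing, which is why the number of languages that end up in the DB can be wrong.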

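Second: use Join on the threads instead of custom wait code. Join blocks until the thread has finished, whereas the empty while loop spins and keeps a CPU core busy for as long as it waits.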

// wait for all threads to complete before proceeding
foreach (Thread thread in threads)
{
    thread.Join();
}
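
As a self-contained illustration of the start-then-Join pattern, the sketch below gives each thread its own slot in a results array and reads the array only after every Join has returned. FakeScan is a hypothetical stand-in for HBScanner.HBscan(), whose real behaviour is not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class JoinDemo
{
    // Hypothetical stand-in for the real scanner: pretends to work, returns a number.
    private static int FakeScan(string path)
    {
        Thread.Sleep(10);
        return path.Length;
    }

    public static int[] ScanAll(string[] paths)
    {
        var results = new int[paths.Length];
        var threads = new List<Thread>();
        for (int i = 0; i < paths.Length; i++)
        {
            int index = i; // local copy, so each thread keeps its own index
            var t = new Thread(() => results[index] = FakeScan(paths[index]));
            t.Start();
            threads.Add(t);
        }

        // Join blocks (without spinning) until the thread has finished.
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return results; // safe to read: all worker threads are done
    }

    public static void Main()
    {
        int[] counts = ScanAll(new[] { "en", "de-DE", "fr" });
        Console.WriteLine(string.Join(",", counts)); // prints 2,5,2
    }
}
```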

Finally, make sure that there is no contention on the database, for example several threads writing through shared state at the same time. Without knowing what HBScanner does, it's difficult to say what else might be causing this issue.
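
If the scanner threads do write to shared state, one common way to rule out lost writes is to serialize those writes with a lock. The sketch below is only an assumption about what the problem might look like: FakeDatabase is a hypothetical in-memory stand-in, not the real database layer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the database layer. List<T> is not thread-safe,
// so every access goes through a single lock.
public sealed class FakeDatabase
{
    private readonly object _gate = new object();
    private readonly List<string> _rows = new List<string>();

    public void Insert(string language)
    {
        lock (_gate)
        {
            _rows.Add(language);
        }
    }

    public int Count
    {
        get { lock (_gate) { return _rows.Count; } }
    }
}

public static class ContentionDemo
{
    public static int InsertFromThreads(string[] languages)
    {
        var db = new FakeDatabase();
        var threads = new List<Thread>();
        foreach (string language in languages)
        {
            string lang = language; // local copy for the closure
            var t = new Thread(() => db.Insert(lang));
            t.Start();
            threads.Add(t);
        }
        foreach (Thread t in threads)
        {
            t.Join();
        }
        return db.Count;
    }

    public static void Main()
    {
        var languages = new string[67];
        for (int i = 0; i < languages.Length; i++)
        {
            languages[i] = "lang" + i;
        }
        Console.WriteLine(InsertFromThreads(languages)); // prints 67
    }
}
```

An alternative with the same effect is to let each thread collect its results locally and do all database writes on one thread after the Join loop.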

Licensed under: CC-BY-SA with attribution