Question

Suppose we have the following code which counts the number of times something occurs:

int i=0;
void f() {
   // do stuff  . . .
   if(something_happens) ++i;
}

int main() {
    std::vector<std::thread> threads;
    for(int j = 0; j< std::thread::hardware_concurrency(); ++j) {
        threads.push_back(std::thread(f));
    }

    std::for_each(threads.begin(), threads.end(), std::mem_fn(&std::thread_join));
    std::cout << "i = " << i << '\n';
}

As it stands there is a clear race condition on i. Using C++11, what is (1) the easiest method for eliminating this race condition, and (2) the fastest method?, preferably without using mutexes. Thanks.

Update: Using the comment to use atomics, I got a working program which compiles under the Intel Compiler, version 13:

#include <iostream>
#include <thread>
#include <vector>
#include <atomic>
#include <algorithm>

std::atomic<unsigned long long> i = 0;

void f(int j) {
    if(j%2==0) {
        ++i;
    }  
}

int main() {
    std::cout << "Atomic i = " << i << "\n";
    int numThreads = 8; //std::thread::hardware_concurrency() not yet implemented by Intel
    std::vector<std::thread> threads;
    for(int k=0; k< numThreads; ++k) {
        threads.push_back(std::thread(f, k));
    }

    std::for_each(threads.begin(), threads.end(), std::mem_fn(&std::thread::join));
        std::cout << "Atomic i = " << i << "\n";
    }
Was it helpful?

Solution

You might want to look into atomic types. You can access them without a need for a lock/mutex.

OTHER TIPS

We solved a similiar problem by declaring an array[nThreads], we then gave every thread an id ranging from 0-n the thread then can write safely at its position in the array. Then you can just sum the array to get the total sum. This however is only helpful as long as you dont need to sum the array before all the threads are dead.

To be even more efficient we had a local counter at each thread which we then before the thread died appended to the array.

example (pseudo code:)

counter[nThreads];

thread(int id)
{
    // Do stuff
    if(something happened)
       counter[id]++;   
}

or

counter[nThreads];

thread(int id)
{
    int localcounter = 0;
    //Do stuff
    if(something happened)
       localcounter++;   

    //Thread is about to die
    counter[id] = localcounter;
}

You can use the InterlockedIncrement function.

Many functions to mutate variables in atomic ways are documented on MSDN under the banner of Synchronization Functions - they may be useful to you.

it is not just a race condition, you could have no communication of the actual i value at all between threads if your compiler decides so.

obviously atomic is the good way. mutex is also a good way, when you don't have collisions they are as fast as atomics. They get slower only when they actually need to make the kernel fiddle with sleeping and ready thread queues. What can get taxy is if the wait signal doesn't use a condition variable in which case you might have to wait for a schedule kernel tick for your ready threads to be running, which can be very long (30ms).

atomics will get you optimality though, and may even be easier to maintain than condition variables thanks to not having to care about spurious events and notify_one versus notify_all etc.

If you check C++11-made STL's shared_ptr base classes, they contain a base_count or (base_shared_count or something) that works exactly like you need. You may also check the new boost::shared_count implementation if you will.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top