Question

In this example code I compute a summation from i=0 to i=n-1 and then add the result to itself k times, where k is the number of threads. I purposely left out the critical section (which would surround the printf and ans += ans) to cause race conditions. However, to my surprise, no race condition happened:

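#include <stdio.h>
#include <omp.h>

int summation_with_operation_after_it_wrong1(int n, int k) {
        int ans = 0;
        #pragma omp parallel firstprivate(n) num_threads(k)
        {
                int i; /* Private */
                /* Each thread sums into a private copy of ans; the copies are
                   combined into the shared ans at the end of the loop. */
                #pragma omp for schedule(dynamic) reduction(+:ans)
                for (i = 0; i < n; i++) {
                        ans += i;
                }
                /* Intentionally unsynchronized read and write of the shared ans. */
                printf("Thread %d ans=%d\n", omp_get_thread_num(), ans);
                ans += ans;
        }
        return ans;
}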

Using n=10 and k=4, the output is (always the same, except for thread order):

Thread 1 ans=45
Thread 3 ans=45
Thread 0 ans=45
Thread 2 ans=45
720

However, I did notice something odd about it: ans was always 45, instead of

Thread 3 ans=45
Thread 0 ans=90
Thread 2 ans=180
Thread 1 ans=360
720

when using critical. So I moved the printf to after the ans += ans to see what it was doing and, to my surprise, the predicted race conditions started to occur all the time!

Thread 3 ans=90
Thread 1 ans=135
Thread 2 ans=90
Thread 0 ans=135
135

So... how did the printf prevent the race conditions? And how did that sum end up being 720? I'm completely lost here.


Solution

Section 1.4 of the latest OpenMP standard specifies the consequences of a data race:

If multiple threads write without synchronization to the same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. Similarly, if at least one thread reads from a memory unit and at least one thread writes without synchronization to that same memory unit, including cases due to atomicity considerations as described above, then a data race occurs. If a data race occurs then the result of the program is unspecified.

What you observe is completely consistent with the last sentence of that passage. In fact, because the behavior of a program containing a data race is unspecified, it makes little sense to ask why a particular output results from a given run. In particular, it is only by chance that you obtained 720 when inserting a printf before the ans += ans statement, and there's no guarantee that you will always see the same behavior.
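For comparison, here is a minimal sketch of a race-free variant, assuming the critical section mentioned in the question wraps both the printf and the doubling. The name summation_with_critical and the team_size out-parameter are hypothetical; they are added only so the caller can see how many threads the runtime actually granted, since num_threads(k) is a request. The #else branch is a fallback so the sketch still builds when OpenMP is disabled.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallbacks (assumption): behave like a single-thread team without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical race-free variant of the function from the question.
   *team_size receives the number of threads that actually ran. */
int summation_with_critical(int n, int k, int *team_size) {
    int ans = 0;
    assert(k > 0 && team_size != NULL);
    #pragma omp parallel num_threads(k)
    {
        int i;
        #pragma omp for schedule(dynamic) reduction(+:ans)
        for (i = 0; i < n; i++) {
            ans += i;
        }
        /* The implicit barrier at the end of the loop guarantees that the
           shared ans now holds the complete sum. */
        #pragma omp critical
        {
            /* Only one thread at a time reads and doubles ans. */
            printf("Thread %d ans=%d\n", omp_get_thread_num(), ans);
            ans += ans;
            *team_size = omp_get_num_threads();
        }
    }
    return ans;
}
```

With n=10 and four threads actually granted, this returns 45 * 2^4 = 720, and the printed values are 45, 90, 180 and 360 in some thread order, which matches the critical output shown in the question.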

Other tips

printf() is a very expensive call to make, and it's not a surprise that using it changes your race condition timing. A better option to see what's happening is to create an array (in advance) to store your results and have each thread deposit its result into this array where you're currently performing the print; then do the actual printf() after all the work has completed.
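A rough sketch of that idea, assuming one array slot per thread indexed by omp_get_thread_num(). The names summation_logged, seen and MAX_THREADS are hypothetical. The race on ans is deliberately left in place, because it is the thing being observed, so the recorded values and the return value remain unspecified.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback (assumption): single-thread team when OpenMP is disabled. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_THREADS 64 /* hypothetical upper bound on k */

/* Hypothetical variant that records what each thread saw instead of
   printing it inside the parallel region. */
int summation_logged(int n, int k, int seen[MAX_THREADS]) {
    int ans = 0;
    int t;
    assert(k > 0 && k <= MAX_THREADS);
    for (t = 0; t < k; t++) {
        seen[t] = -1; /* -1 marks a slot no thread wrote to */
    }
    #pragma omp parallel num_threads(k)
    {
        int i;
        #pragma omp for schedule(dynamic) reduction(+:ans)
        for (i = 0; i < n; i++) {
            ans += i;
        }
        /* Where the printf used to be: a cheap store into this thread's
           own slot. The accesses to ans are still racy, on purpose. */
        seen[omp_get_thread_num()] = ans;
        ans += ans;
    }
    /* All work is done; printing here no longer perturbs the timing. */
    for (t = 0; t < k; t++) {
        if (seen[t] >= 0) {
            printf("Thread %d ans=%d\n", t, seen[t]);
        }
    }
    return ans;
}
```

Each thread writes only to its own slot, so the logging itself adds no new race; only the accesses to ans remain unsynchronized.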

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow