You have a data race in your loop counters:
#pragma omp for
for (i = 0; i < nx; i++) {
    for (j = 0; j < ny; j++) {      // <--- data race
        for (k = 0; k < nz; k++) {  // <--- data race
            arr_par[i][j][k] = i*j + k;
        }
    }
}
Since neither j nor k is given the private data-sharing attribute, all threads share a single copy of each. When several threads increment them concurrently, their values become unpredictable and can exceed the corresponding bounds, resulting in out-of-bounds accesses to arr_par. The chance of several threads incrementing j or k at the same time grows with the number of iterations.
The best way to handle such cases is to simply declare the loop variables inside the loop statements themselves:
#pragma omp for
for (int i = 0; i < nx; i++) {
    for (int j = 0; j < ny; j++) {
        for (int k = 0; k < nz; k++) {
            arr_par[i][j][k] = i*j + k;
        }
    }
}
The other way is to add a private(j,k) clause to the directive that opens the parallel region:
#pragma omp parallel default(shared) private(threadid) private(j,k)
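For context, a minimal sketch of how the whole region might then look (assuming threadid, nx, ny, nz, and arr_par are declared as in your code; how you actually use threadid is up to you):

#pragma omp parallel default(shared) private(threadid) private(j,k)
{
    threadid = omp_get_thread_num();   /* each thread writes its own private copy */
    #pragma omp for
    for (i = 0; i < nx; i++) {         /* i is implicitly private, see below */
        for (j = 0; j < ny; j++) {
            for (k = 0; k < nz; k++) {
                arr_par[i][j][k] = i*j + k;
            }
        }
    }
}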
It is not strictly necessary to make i private in your case, since the loop variables of parallel loops are implicitly made private. Still, if i is used somewhere else in the code, it might make sense to make it private to prevent other data races.
Also, don't use clock() to measure the time of parallel applications, since on most Unix OSes it returns the total CPU time accumulated by all threads. Use omp_get_wtime() instead.
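A minimal timing sketch (the body of the parallel region is just a placeholder for your own work):

#include <stdio.h>
#include <omp.h>

int main(void) {
    double start = omp_get_wtime();   /* wall-clock time, not summed CPU time */

    #pragma omp parallel
    {
        /* ... parallel work goes here ... */
    }

    double end = omp_get_wtime();
    printf("elapsed: %f seconds\n", end - start);
    return 0;
}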