Question

recently I have been studying C++. And the prime number finder is an exercise. After I wrote the serial version code, I tried to use openmp to parallelize the code. However, the serial version works fine but once I tried openmp version, the output (prime numbers between 1 and 100) is very wrong. Any suggestion?

Here is my code (the relative functions):

    void prime2_bits(const int& n)
    {
       omp_set_num_threads(4);
       int i,j,tp,sq=sqrt(n+1);
       int p[n/32+1];
       initint(p,n/32+1);
    #pragma omp parallel for default(shared) private(i,j,tp) schedule(dynamic)
       for(i=2;i<sq;i++){
         if(!((p[i/32]>>(i%32))&1)){ // see whether the ith bit is 0
           tp=i*i;
           for(j=tp;j<n+1;j+=i){
             p[j/32]|=(1<<(j%32)); // set the jth bit as 1
           }
         }
       }
    // Do the printing job
       for(i=2;i<n+1;i++)
         if(!((p[i/32]>>(i%32))&1)) cout<<i<<' ';
    }

and the output looks like:

using method 2 (bits to save space):
2 3 5 7 11 13 15 17 19 21 23 27 29 31 35 37 41 43 47 49 53 55 59 61 67 69 71 73 77 79 81 83 87 89 91 93 97 99 0 msec used!

Also, to increase efficiency, I modified the serial code again. Then tried to parallel it with openmp without success. Any suggestion for this?

the modified code:

void prime3_norev(const int& n)
{
   int i,j,tp,m=n+1;
   int pi=0;
   bool pFlag[m];
   int p[n/3];
   omp_set_num_threads(4);
   initbool(pFlag,m);
   initint(p,n/3);
#pragma omp parallel for default(shared) private(i,j) schedule(dynamic) 
   for(i=2;i<m;i++){
     if(!pFlag[i]) p[pi++]=i;
     for(j=0;(j<pi)&&(i*p[j]<m);j++){
       pFlag[i*p[j]]=true;
       if(i%p[j]==0) break;
     }
   }
// Do the printing job
   for(i=0;i<pi;i++)
     cout<<p[i]<<' ';
}

and the corresponding output:

using method 3 (no revisits, space for time):
2 3 6 7 11 13 15 19 23 29 31 35 37 41 43 47 53 55 59 61 65 67 71 73 79 83 85 89 95 97 0 msec used!

The code was compiled with:

g++ -O2 -fopenmp fib.cc

Thank you very much!

Was it helpful?

Solution

To fix your code you can use the ordered clause

#pragma omp parallel for schedule(dynamic) ordered
for(int i=2;i<m;i++) {
    #pragma omp ordered
    if(!pFlag[i]) p[pi++]=i;
    //rest of code

For a working example see http://coliru.stacked-crooked.com/a/fa1eebf126940fb0. I used schedule(static) there only to show the error easier without the ordered clause.

I'm not sure what effect ordered will have on performance though. It looks to me like you're trying to parallelize the Sieve of Eratosthenes. In that case I suggest you check out http://create.stephan-brumme.com/eratosthenes/. It finds all the primes up to 10^9 in about 1 second using OpenMP.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top