Question

I have a simple C program that solves the N-queens problem, and I parallelized it using OpenMP. Now I want to run both the serial and parallel versions and calculate the speedup. I don't want to create a new file for the serial code, or copy my solution into a second function without the OpenMP directive. Instead, I want to keep one function and choose from main whether it runs serially or in parallel. I thought of using the preprocessor, but I'm not sure whether that is possible and, if so, how to achieve it.

void solve() 
{
    int i;

    #if PARALLEL == 1
      #pragma omp parallel for
    #endif
    for(i = 0; i < size; i++) {
        int *queens = (int*)malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}


int main()
{
   ...

    #define PARALLEL 0
    st_start = clock();
    solve();
    st_end = clock();

    #define PARALLEL 1
    pt_start = omp_get_wtime();
    solve();
    pt_end = omp_get_wtime();

    ...
}

Solution 2

Unfortunately, you can't do it like that.

The preprocessor just scans your source text from top to bottom and replaces the # directives. Only after that does the compiler compile the code, and by then no directives are left.

So in the code you posted, the preprocessor starts at the first line, reaches solve(), and keeps the #pragma only if PARALLEL is 1 at that point. PARALLEL is not defined yet there, so #if treats it as 0 and the pragma is dropped. The preprocessor then continues to main, where PARALLEL is defined as 0 and then as 1, which has no effect on solve().

It does NOT start at main and then step into solve().
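
The ordering can be checked with a small sketch. MODE, compiled_parallel and report_mode are hypothetical names standing in for PARALLEL and solve; the point is only that the #if is resolved once, at the place where it appears in the file, so a later redefinition cannot reach back into the function.

```c
#include <stdio.h>

#define MODE 0              /* hypothetical flag, standing in for PARALLEL */

/* The #if below is resolved right here, while MODE is still 0. */
int compiled_parallel(void)
{
#if MODE == 1
    return 1;               /* "parallel" branch */
#else
    return 0;               /* "serial" branch */
#endif
}

/* Redefining the macro afterwards changes nothing in compiled_parallel(). */
#undef MODE
#define MODE 1

/* Call this from main to see which branch was actually compiled in. */
void report_mode(void)
{
    printf("compiled_parallel() returned %d while MODE is now %d\n",
           compiled_parallel(), MODE);
}
```

Calling report_mode() from main reports 0 for the function and 1 for the macro, which is the same mismatch the question's code runs into.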

You might want to take a look at OpenMP: conditional use of #pragma

You could try

void solve(int paral) 
{
    int i;


    #pragma omp parallel for if (paral == 1)

    for(i = 0; i < size; i++) {
        int *queens = malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}

I haven't tried this code and I'm not experienced with OMP, though...
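
A fuller sketch of the if clause idea, under some assumptions: the question does not show setQueen, so is_safe and count_from are hypothetical stand-ins that count solutions by plain backtracking, and solve returns the count so that the two modes can be checked against each other. When the file is compiled without -fopenmp the pragma is ignored and both calls simply run serially.

```c
#include <stdlib.h>

/* Hypothetical helper: can a queen go at (row, col), given rows 0..row-1? */
static int is_safe(const int *queens, int row, int col)
{
    for (int r = 0; r < row; r++) {
        int c = queens[r];
        if (c == col || abs(c - col) == row - r)
            return 0;       /* same column or same diagonal */
    }
    return 1;
}

/* Hypothetical stand-in for setQueen: count completions from a given row. */
static long count_from(int *queens, int row, int n)
{
    if (row == n)
        return 1;
    long count = 0;
    for (int col = 0; col < n; col++) {
        if (is_safe(queens, row, col)) {
            queens[row] = col;
            count += count_from(queens, row + 1, n);
        }
    }
    return count;
}

/* One function for both modes: the if clause is evaluated at run time. */
long solve(int n, int parallel)
{
    long total = 0;
    #pragma omp parallel for reduction(+:total) if(parallel)
    for (int i = 0; i < n; i++) {
        int *queens = malloc(sizeof *queens * n);
        if (queens == NULL)
            continue;       /* sketch only: a real program should report this */
        queens[0] = i;      /* fix the first-row queen, one column per iteration */
        total += count_from(queens, 1, n);
        free(queens);
    }
    return total;
}
```

Calling solve(8, 0) and solve(8, 1) should both give the same count, and the two calls can then be timed separately. Note that the serial call still goes through code the compiler generated for OpenMP, so it is not quite the same as a build without OpenMP.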

OTHER TIPS

Edit: I thought of a way to do this using the preprocessor. It avoids duplicating code at the expense of making compiling and linking slightly more complicated. It relies on the fact that when OpenMP is not enabled in the compiler, the OpenMP pragmas are ignored and the _OPENMP macro is not defined.

#include <stdlib.h>

/* static: this file is compiled twice, so each object file needs its own private copy */
static void setQueen(int* x, int y, int z) {
   /*code*/
}
#if defined _OPENMP
void solve_parallel(const int size)
#else
void solve_serial(const int size)
#endif
{
    int i;
    #pragma omp parallel for      
    for(i = 0; i < size; i++) {
        int *queens = (int*)malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}

Compile with

gcc -O3 -c foo.c -o solve_serial
gcc -O3 -fopenmp -c foo.c -o solve_parallel

Then you can use a main function similar to the one below, with function pointers, and link in the solve_serial and solve_parallel object files. Pass -fopenmp at the link step as well, so that the OpenMP runtime is linked in.

Another option is to pass the number of threads like this:

void solve(const int nthreads)
{
    int i;
    const int size = 10;
    #pragma omp parallel for num_threads(nthreads)
    for(i = 0; i < size; i++) {
        int *queens = (int*)malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}

However, even for nthreads=1 the compiler has to insert OpenMP constructs, which can reduce performance compared to not compiling with OpenMP and can therefore give a biased comparison.

A fairer solution is to define two functions, with and without OpenMP, and then use an array of function pointers (see below). This is more useful when you have several variations of a function that you want to compare for optimization.

#include <stdlib.h>
#include <omp.h>

void setQueen(int* x, int y, int z);  /* defined elsewhere */
void solve_parallel(const int size)
{
    int i;
    #pragma omp parallel for
    for(i = 0; i < size; i++) {
        int *queens = (int*)malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}

void solve_serial(const int size)
{
    int i;
    for(i = 0; i < size; i++) {
        int *queens = (int*)malloc(sizeof(int)*size);
        setQueen(queens, 0, i);
        free(queens);
    }
}

int main(void) {
    const int size = 100;
    int i;
    double dtime[2];
    void (*solve[2])(int);

    solve[0] = solve_serial;
    solve[1] = solve_parallel;

    solve[1](size);  /* run OpenMP once to warm it up */

    for(i=0; i<2; i++) {
        dtime[i] = omp_get_wtime();
        solve[i](size);
        dtime[i] = omp_get_wtime() - dtime[i];
    }

    return 0;
}
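
The loop above records the two timings but stops short of the speedup itself. A minimal sketch of that last step, assuming C11's timespec_get for wall-clock time so that both runs are measured with the same clock (the question mixes clock(), which on most POSIX systems reports CPU time summed over all threads, with omp_get_wtime(), which reports wall time). time_call, speedup and dummy_solve are hypothetical names, not part of the code above.

```c
#include <time.h>

/* Hypothetical workload with the same signature as solve_serial/solve_parallel. */
void dummy_solve(int size)
{
    volatile long sink = 0;     /* volatile so the loop is not optimized away */
    for (int i = 0; i < size; i++)
        sink += i;
}

/* Wall-clock seconds taken by one call of fn(size). */
double time_call(void (*fn)(int), int size)
{
    struct timespec t0, t1;
    timespec_get(&t0, TIME_UTC);
    fn(size);
    timespec_get(&t1, TIME_UTC);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Speedup = serial time / parallel time; returns 0 if the parallel time is not positive. */
double speedup(double serial_seconds, double parallel_seconds)
{
    if (parallel_seconds <= 0.0)
        return 0.0;
    return serial_seconds / parallel_seconds;
}
```

With the function-pointer array from main, speedup(time_call(solve[0], size), time_call(solve[1], size)) would give the ratio. Averaging over several runs should make the figure less noisy.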

Compilation begins by calling the preprocessor, which is the first part of the compilation process; that's where all the inclusions are resolved, all the preprocessor directives are processed, constants are replaced with their values, and so on.

So you cannot use preprocessor directives to make runtime decisions; you can only make compile-time decisions.

Licensed under: CC-BY-SA with attribution