Question

I'm writing a parallel version of the Longest Common Subsequence algorithm using openMP.

The sequential version is the following (and it works correctly):

// Preparing first row and first column with zeros
for(j=0; j < (len2+1); j++)
    score[0][j] = 0;

for(i=0; i < (len1+1); i++)
    score[i][0] = 0;

// Calculating scores
for(i=1; i < (len1+1); i++) {
    for(j=1; j < (len2+1) ;j++) {
        if (seq1[i-1] == seq2[j-1]) {
               score[i][j] = score[i-1][j-1] + 1;
        }
        else {
            score[i][j] = max(score[i-1][j], score[i][j-1]);
        }
    }
}

The critical part is filling up the score matrix and this is the part I'm trying to mostly parallelize.

One way to do it (which I chose) is: filling up the matrix by anti diagonals, so left, top and top-left dependecies are always satisfied. In a nutshell, I keep track of the diagonal (third loop, variable i below) and threads fill up that diagonal in parallel. For this purpose, I've written this code:

void parallelCalculateLCS(int len1, int len2, char *seq1, char *seq2) {

int score[len1 + 1][len2 + 1];
int i, j, k, iam;
char *lcs = NULL;

for(i=0;i<len1+1;i++)
    for(j=0;j<len2+1;j++)
        score[i][j] = -1;

#pragma omp parallel default(shared) private(iam)
{
    iam = omp_get_thread_num();
// Preparing first row and first column with zeros
    #pragma omp for
    for(j=0; j < (len2+1); j++)
        score[0][j] = iam;

    #pragma omp for
    for(i=0; i < (len1+1); i++)
        score[i][0] = iam;

// Calculating scores
    for(i=1; i < (len1+1); i++) {
        k=i;
        #pragma omp for
        for(j=1; j <= i; j++) {
            if (seq1[k-1] == seq2[j-1]) {
                //  score[k][j] = score[k-1][j-1] + 1;
                score[k][j] = iam;
            }
            else {
            //  score[k][j] = max(score[k-1][j], score[k][j-1]);
                score[k][j] = iam;
            }
            #pragma omp atomic
            k--;
        }
    }

}
}

The first two loops (first row and column) work correctly and threads fill up cells in a balanced way.

When it comes to fill up the matrix (diagonally), nothing works well. I tried to debug it, but it seems that threads act and write things randomly.

I can't figure out what's going wrong, since in the first two loops there were no problems at all.

Any idea?

P.S. I know that accessing matrix in a diagonal way is very cache-unfriendly and threads could be unbalanced, but I only need it to work by now.

P.S. #2 I don't know if it could be useful, but my CPU has up to 8 threads.

Was it helpful?

Solution

#pragma omp atomic means that the processors will perform the operation one at a time. You are looking for #pragma omp for private(k) : the processors will no longer share the same value. Bye, Francis

OTHER TIPS

The following nested for loop

 #pragma omp for
 for(j=1; j <= i; j++) 

will be executed in parallel, each thread with a different value of j in no specific order. As nothing is specified in the omp for section, k will be shared by default between all threads. So depending on the order of the threads, k will be decremented at an unknown time (even with the omp atomic). So for a fixed j, the value of k might change during the execution of the body of the for loop (between the if clauses, ...).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top