Question

I have matrix multiply code that computes Matrix A * Matrix B = Matrix C as follows:

for(j=1;j<=n;j++) {
 for(l=1;l<=k;l++) {
  for(i=1;i<=m;i++) {
   C[i][j] = C[i][j] + B[l][j]*A[i][l];
  }
 }
}

Now I want to turn it into a multithreaded matrix multiply. My code is as follows.

I use a struct:

struct ij
{
 int rows;
 int columns;
};

My thread function is:

void *MultiplyByThread(void *t)
{
 struct ij *RowsAndColumns = t;
 double total=0; 
 int pos; 
 for(pos = 1;pos<k;pos++)
 {
  fprintf(stdout, "Current Total For: %10.2f",total);
  fprintf(stdout, "%d\n\n",pos);
  total += (A[RowsAndColumns->rows][pos])*(B[pos][RowsAndColumns->columns]);
 }
 D[RowsAndColumns->rows][RowsAndColumns->columns] = total;
 pthread_exit(0);

}

and inside my main is

      for(i=1;i<=m;i++) {
        for(j=1;j<=n;j++) {

   struct ij *t = (struct ij *) malloc(sizeof(struct ij));
   t->rows = i;
   t->columns = j;

    pthread_t thread;
    pthread_attr_t threadAttr;
    pthread_attr_init(&threadAttr);
    pthread_create(&thread, &threadAttr, MultiplyByThread, t);    
    pthread_join(thread, NULL);    

        }
      }

But I can't seem to get the same result as the first matrix multiply (which is correct). Can someone point me in the right direction?


Solution

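Your threading code does not actually run in parallel. You create a thread and then immediately wait for it to complete by calling pthread_join right after pthread_create, so only one worker thread exists at a time. You need an m x n array of threads: launch them all, and then join them all. Apart from that, the code computes almost the same thing as the loop, with one visible difference: the thread function loops while pos<k, whereas the serial loop runs while l<=k, so each thread skips the last term of the sum. It also stores its result in D rather than C. What is the exact discrepancy with the results?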

Example (note, not compiled):

pthread_t threads[m+1][n+1]; /* Threads that will execute in parallel. Sized m+1 by n+1 because the loops below are 1-based; declare it in main, once m and n are known. */

and then in the main:

 for(i=1;i<=m;i++) {
    for(j=1;j<=n;j++) {

    struct ij *t = (struct ij *) malloc(sizeof(struct ij)); /* free this once the thread has finished with it */
    t->rows = i;
    t->columns = j;

    /* NULL selects the default thread attributes */
    pthread_create(&threads[i][j], NULL, MultiplyByThread, t);
    }
  }

  /* join all the threads */
  for(i=1;i<=m;i++) {
    for(j=1;j<=n;j++) {
       pthread_join(threads[i][j], NULL);
    }
  }

(More or less: the point is simply not to call pthread_join inside the loop that creates the threads.)
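For comparison, here is a minimal sketch of the same "create all, then join all" pattern that could be compiled as-is. It is not the asker's code. It assumes 0-based indexing, fixed sizes M, K and N (hypothetical names), and one thread per row of the result instead of one thread per cell, which keeps the thread count at M rather than M*N. The inner loop runs over all K terms.

```c
#include <assert.h>
#include <pthread.h>

#define M 3 /* rows of A and C (assumed size) */
#define K 2 /* columns of A, rows of B (assumed size) */
#define N 4 /* columns of B and C (assumed size) */

static double A[M][K], B[K][N], C[M][N];

/* Worker: computes one full row of C, so no two threads write the same cell. */
static void *multiply_row(void *arg)
{
    int i = *(int *)arg;
    for (int j = 0; j < N; j++) {
        double total = 0.0;
        for (int l = 0; l < K; l++) /* all K terms, none skipped */
            total += A[i][l] * B[l][j];
        C[i][j] = total;
    }
    return NULL;
}

/* Launch every worker first, then join them all in a second loop. */
static void multiply_threaded(void)
{
    pthread_t threads[M];
    int rows[M]; /* one argument slot per thread, alive until the joins finish */

    for (int i = 0; i < M; i++) {
        rows[i] = i;
        int rc = pthread_create(&threads[i], NULL, multiply_row, &rows[i]);
        assert(rc == 0);
        (void)rc; /* silences the unused warning when NDEBUG disables assert */
    }
    for (int i = 0; i < M; i++)
        pthread_join(threads[i], NULL);
}
```

Fill A and B, call multiply_threaded(), and read the product from C. With GCC or Clang this would typically be built with the -pthread option.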

OTHER TIPS

Try the following:

#pragma omp parallel for private(i, l, j)
for(j=1;j<=n;j++) {
    for(l=1;l<=k;l++) {
        for(i=1;i<=m;i++) {
            C[i][j] = C[i][j] + B[l][j]*A[i][l];
        }
    }
}

While Googling for the GCC compiler switch to enable OpenMP (it is -fopenmp), I came across this blog post, which describes what happens better than I could and also contains a better example.

OpenMP is supported by most mainstream compilers for multicore machines; see the OpenMP web site for more information.
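As a self-contained illustration of the OpenMP approach, here is a minimal sketch. It assumes 0-based, row-major flat arrays and a hypothetical function name matmul_omp, neither of which comes from the question. The outer loop over columns is split across threads, and each thread writes a disjoint set of cells of C, so no locking is needed.

```c
#include <assert.h>

/* Computes C = A * B, where A is m x k, B is k x n and C is m x n,
 * all stored row-major in flat arrays (an assumption of this sketch). */
void matmul_omp(int m, int k, int n,
                const double *A, const double *B, double *C)
{
    assert(A && B && C);

    /* Iterations over j are divided among the threads. Variables declared
     * inside the loop body (i, l, total) are private to each thread. */
    #pragma omp parallel for
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double total = 0.0;
            for (int l = 0; l < k; l++)
                total += A[i * k + l] * B[l * n + j];
            C[i * n + j] = total;
        }
    }
}
```

Compiled with gcc -fopenmp the loop runs in parallel. Without that switch the pragma is ignored and the same code runs serially, which makes it easy to compare the two results.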

Licensed under: CC-BY-SA with attribution