Question

I implemented a simple matrix-vector multiplication for sparse matrices in CRS format, using an implicit OpenMP directive in the multiplication loop.

The complete code is in GitHub: https://github.com/torbjoernk/openMP-Examples/blob/icc_gcc_problem/matxvec_sparse/matxvec_sparse.cpp
Note: It's ugly ;-)

To control the private and shared memory I'm using restrict pointers. Compiling it with GCC 4.6.3 on 64-bit Linux works fine (apart from two warnings about %u and unsigned int in a printf call, but that's not the point).

However, compiling it with ICC 12.1.0 on 64-bit Linux fails with the error:

matxvec_sparse.cpp(79): error: "default_n_row" must be specified in a variable list at enclosing OpenMP parallel pragma
    #pragma omp parallel \
    ^

with the definition of the variable and pointer in question

int default_n_row = 4;
int *n_row = &default_n_row;

and the OpenMP directive defined as

#pragma omp parallel \
  default(none) \
  shared(n_row, aval, acolind, arowpt, vval, yval) \
  private(x, y)
{
  #pragma omp for \
    schedule(static)
  for ( x = 0; x < *n_row; x++ ) {
    yval[x] = 0;
    for ( y = arowpt[x]; y < arowpt[x+1]; y++ ) {
      yval[x] += aval[y] * vval[ acolind[y] ];
    }
  }
} /* end PARALLEL */

Compiled with g++:

c++ -fopenmp -O0 -g -std=c++0x -Wall -o matxvec_sparse matxvec_sparse.cpp

Compiled with icc:

icc -openmp -O0 -g -std=c++0x -Wall -restrict -o matxvec_sparse matxvec_sparse.cpp

  • Is it an error in usage of GCC/ICC?
  • Is this a design issue in my code causing undefined behaviour?
    If so, which line(s) is/are causing it?
  • Is it just inconsistency between ICC and GCC?
    If so, what would be a good way to achieve compiler independence and compatibility?

Solution

Huh. Looking at the code, it's clear what icpc thinks the problem is, but I'm not sure without going through the specification which compiler is doing the right thing here, g++ or icpc.

The issue isn't the restrict keyword: if you take all of those out and drop the -restrict option to icpc, the problem remains. The issue is that the parallel section is declared with default(none) shared(n_row, ...), but n_row is, at the start of the program, a pointer to default_n_row. icpc requires that default_n_row also be given a data-sharing attribute (shared, or at least something) in that omp parallel section.
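One way to sidestep the disagreement, sketched here under assumptions rather than taken from the original code, is to read *n_row into a local variable before the parallel region and name only that local in the data-sharing clauses. The region then never dereferences the pointer, so default_n_row is not referenced inside it at all. The function name spmv_crs, its signature, and the use of std::vector are hypothetical; the loop body follows the one in the question. I have not checked this against ICC 12.1.0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): computes y = A * v
// for a sparse matrix A stored in CRS form (aval, acolind, arowpt).
std::vector<double> spmv_crs(const int *n_row,
                             const std::vector<double> &aval_in,
                             const std::vector<int> &acolind_in,
                             const std::vector<int> &arowpt_in,
                             const std::vector<double> &vval_in)
{
  // Read the row count once, outside the parallel region, so that the
  // region itself never touches *n_row (and thus never default_n_row).
  int n = *n_row;
  assert(arowpt_in.size() == static_cast<std::size_t>(n) + 1);

  std::vector<double> y_out(n, 0.0);

  // Plain local pointers, so every name used in the region is a local
  // variable that can be listed explicitly under default(none).
  const double *aval = aval_in.data();
  const int *acolind = acolind_in.data();
  const int *arowpt = arowpt_in.data();
  const double *vval = vval_in.data();
  double *yval = y_out.data();

  #pragma omp parallel \
    default(none) \
    shared(aval, acolind, arowpt, vval, yval) \
    firstprivate(n)
  {
    // x and y are declared inside the region, so they are private
    // without needing a private(...) clause.
    #pragma omp for \
      schedule(static)
    for (int x = 0; x < n; x++) {
      double sum = 0.0;
      for (int y = arowpt[x]; y < arowpt[x + 1]; y++) {
        sum += aval[y] * vval[acolind[y]];
      }
      yval[x] = sum;
    }
  } /* end PARALLEL */

  return y_out;
}
```

For the 2x2 matrix with rows (1, 2) and (0, 3), stored as aval = {1, 2, 3}, acolind = {0, 1, 1}, arowpt = {0, 2, 3}, multiplying by v = {1, 1} should give {3, 3}. The other option, which is what the icpc message literally asks for, would be to add default_n_row to the shared(...) list of the original pragma; that presumably satisfies icpc, but it ties the pragma to whatever n_row happens to point at.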

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow