Question

I have a 2D matrix SIZE x SIZE, which I'm trying to copy to the GPU.

I allocate the matrix this way:

#define SIZE 1024
float (*a)[SIZE] = (float(*)[SIZE]) malloc(SIZE * SIZE * sizeof(float));

And I have this on my ACC region:

void mmul_acc(float (*restrict a)[SIZE],
              float (*restrict b)[SIZE],
              float (*restrict c)[SIZE]) {
#pragma acc data copyin(a[0:SIZE][0:SIZE], b[0:SIZE][0:SIZE]) \
    copyout(c[0:SIZE][0:SIZE])
{
  ... code here...
}
}
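
For readers who want something they can compile, here is a minimal sketch of what such a data region might wrap, assuming a plain triple-loop matrix multiply. The loop body, the name mmul_sketch, and the reduced SIZE of 128 are illustrative assumptions, not the original code. A compiler without OpenACC support ignores the pragmas and runs the loops on the host.

```c
#include <stdlib.h>

/* Assumption: smaller than the question's 1024 so a host-only run is quick. */
#define SIZE 128

/* Hypothetical sketch: c = a * b for SIZE x SIZE row-major matrices.
   The loop nest is assumed; the question elides the real body. */
void mmul_sketch(float (*restrict a)[SIZE],
                 float (*restrict b)[SIZE],
                 float (*restrict c)[SIZE])
{
    /* Copy a and b to the device on entry, copy c back on exit. */
#pragma acc data copyin(a[0:SIZE][0:SIZE], b[0:SIZE][0:SIZE]) \
                 copyout(c[0:SIZE][0:SIZE])
    {
        /* Each c[i][j] is computed independently of the others. */
#pragma acc kernels loop independent
        for (int i = 0; i < SIZE; ++i) {
#pragma acc loop independent
            for (int j = 0; j < SIZE; ++j) {
                float sum = 0.0f;
                for (int k = 0; k < SIZE; ++k)
                    sum += a[i][k] * b[k][j];
                c[i][j] = sum;
            }
        }
    }
}
```

The three matrices can be allocated exactly as shown earlier (one malloc per matrix, cast to a pointer to rows of SIZE floats) and passed straight to the function.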

When compiling with the PGI compiler, using -Minfo=acc, the compiler tells me:

Generating copyin(a[0:1024][0:])

What does a[0:1024][0:] mean? Why not a[0:1024][0:1024]?

If, instead of declaring matrices, I declare flat arrays of size SIZE*SIZE and use

#pragma acc data copyin(a[0:SIZE*SIZE])

the compiler generates the following message:

Generating copyin(a[0:16777216])

The code behaves the same either way: same performance, same result.

Apparently the compiler generates the same code in both cases, as it should, but the message for the 2D version is not straightforward.
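
The reason both forms can describe the same transfer is that a matrix allocated with a single malloc is one contiguous block, so the 2D section and the flat section cover the same bytes. A small sketch of that equivalence follows; the helper names row_t, alloc_matrix, flat_view and flat_index are made up for illustration.

```c
#include <stdlib.h>

#define SIZE 1024

typedef float row_t[SIZE];   /* one matrix row of SIZE floats */

/* Allocate a contiguous SIZE x SIZE matrix, addressed as m[i][j]. */
row_t *alloc_matrix(void)
{
    return malloc(SIZE * sizeof(row_t));
}

/* View the same storage as a flat array of SIZE * SIZE floats. */
float *flat_view(row_t *m)
{
    return (float *)m;
}

/* Index in the flat view that corresponds to m[i][j] (row-major). */
size_t flat_index(size_t i, size_t j)
{
    return i * SIZE + j;
}
```

With these helpers, writing m[3][5] and then reading flat_view(m)[flat_index(3, 5)] touches the same float. For SIZE defined as 1024 the flat section spans 1024 * 1024 = 1048576 elements; a count of 16777216 corresponds to 4096 * 4096, so the message quoted above was presumably produced with a larger SIZE.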

I'm using the PGI Accelerator compiler 12.8 on a 64-bit Linux machine, compiling with -Minfo=acc.

Note: this question was edited, so it no longer makes much sense, but maybe it can be useful to other people.

Solution

This issue is fixed in the latest PGI compiler, 12.9.0. The compiler now returns the following message:

Generating copyin(a[0:1024][0:1024])
Licensed under: CC-BY-SA with attribution