سؤال

This kernel using two __restrict__ int arrays compiles fine:

__global__ void kerFoo( int* __restrict__ arr0, int* __restrict__ arr1, int num )
{
    for ( /* Iterate over array */ )
        arr1[i] = arr0[i];  // Copy one to other
}

However, the same two int arrays composed into a pointer array fails compilation:

__global__ void kerFoo( int* __restrict__ arr[2], int num )
{
    for ( /* Iterate over array */ )
        arr[1][i] = arr[0][i];  // Copy one to other
}

The error given by the compiler is:

error: invalid use of `restrict'

I have certain structures that are composed as an array of pointers to arrays. (For example, a struct passed to the kernel that has int* arr[16].) How do I pass them to kernels and be able to apply __restrict__ on them?

هل كانت مفيدة؟

المحلول

The CUDA C manual only refers to the C99 definition of __restrict__, no special CUDA-specific circumstances.

Since the indicated parameter is an array containing two pointers, this use of __restrict__ looks perfectly valid to me, no reason for the compiler to complain IMHO. I would ask the compiler author to verify and possibly/probably correct the issue. I'd be interested in different opinions, though.

One remark to @talonmies:

The whole point of restrict is to tell the compiler that two or more pointer arguments will never overlap in memory.

This is not strictly true. restrict tells the compiler that the pointer in question, for the duration of its lifetime, is the only pointer through which the pointed-to object can be accessed. Be aware that the object pointed to is only assumed to be an array of int. (In truth it's only one int in this case.) Since the compiler cannot know the size of the array, it is up to the programmer to guard the array's boundaries..

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top