Including thrust/sort.h before curand_kernel.h gives compilation error

https://stackoverflow.com/questions/23352122

11-07-2023

Question

This code compiles fine:

#include <curand_kernel.h>
#include <thrust/sort.h>

int main(void) {
    return 0;
}

while this gives compilation errors:

#include <thrust/sort.h>
#include <curand_kernel.h>

int main(void) {
    return 0;
}

CUDA 6 gives me two errors and a warning:

... curand_mtgp32_kernel.h(315): error: calling a __device__ function("__syncthreads") from a __host__ __device__ function("curand_mtgp32_single") is not allowed

... include/curand_mtgp32_kernel.h(373): error: calling a __device__ function("__syncthreads") from a __host__ __device__ function("curand_mtgp32_single_specific") is not allowed

... curand_kernel.h(392): warning: missing return statement at end of non-void function "__curand_uint32_as_float"

On CUDA 5 I get only:

... curand_kernel.h(405): warning: missing return statement at end of non-void function "__curand_uint32_as_float"

where "..." is my CUDA install directory. It seems all these areas are wrapped in macros testing whether __CUDA_ARCH__ is defined, or what its value is. I am seeing this problem only with sort.h, not with the various other thrust headers I am including (host/device vectors, scan, scatter+gather and some fancy iterators).
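For readers unfamiliar with this kind of guard, here is a minimal sketch of how `__CUDA_ARCH__` selects between host and device code paths. The function and kernel names are hypothetical and are not taken from the cuRAND or Thrust headers; the sketch only illustrates the general mechanism, not the actual cause of the errors above.

```cuda
#include <cstdio>

// Hypothetical helper: __CUDA_ARCH__ is defined only while nvcc compiles
// the device version of a function, so the two branches below end up in
// different compilation passes.
__host__ __device__ int compiled_for_device(void) {
#ifdef __CUDA_ARCH__
    return 1;  // device compilation pass
#else
    return 0;  // host compilation pass
#endif
}

// Hypothetical kernel that records which branch the device code took.
__global__ void report(int *out) { *out = compiled_for_device(); }

int main(void) {
    int *d_flag = NULL;
    int h_flag = -1;

    cudaMalloc(&d_flag, sizeof(int));
    report<<<1, 1>>>(d_flag);
    cudaMemcpy(&h_flag, d_flag, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_flag);

    printf("host branch: %d, device branch: %d\n",
           compiled_for_device(), h_flag);
    return 0;
}
```

Because such guards depend on the macro state at the point where a header is processed, anything that changes that state earlier in the translation unit can change which branch a later header sees.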

Given this behaviour, I cannot (easily and sensibly) make all headers in my current project self-sufficient. Can anyone explain why the order matters here, and whether it is a bug or a 'feature' of one of these headers?

Solution

This is a known issue (actually two separate issues). It was identified too close to the CUDA 6 release to be fixed in that release.

It should be fixed in a future release.

In the meantime, you should be able to work around it by reversing the order of inclusion, that is, by including curand_kernel.h before thrust/sort.h as in the first example in the question. Alternatively, you can try updating to the current Thrust master branch.
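As a rough illustration of the include-order workaround, the sketch below pulls in curand_kernel.h ahead of the Thrust headers and then uses both libraries in one translation unit. The kernel name `fill_uniform`, the seed, and the array size are assumptions made for the example; in a larger project the same ordering could be pinned in a single shared header that every source file includes first.

```cuda
// Workaround sketch: on the affected toolkits, include the cuRAND device
// header BEFORE thrust/sort.h.
#include <curand_kernel.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <cstdio>

// Hypothetical kernel: each thread draws one uniform float from its own
// cuRAND state (same seed, different sequence number per thread).
__global__ void fill_uniform(float *out, int n, unsigned long long seed) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        curandState state;
        curand_init(seed, i, 0, &state);
        out[i] = curand_uniform(&state);
    }
}

int main(void) {
    const int n = 1024;  // illustrative size
    thrust::device_vector<float> values(n);

    fill_uniform<<<(n + 255) / 256, 256>>>(
        thrust::raw_pointer_cast(values.data()), n, 1234ULL);
    cudaDeviceSynchronize();

    // Sort on the device and check the result.
    thrust::sort(values.begin(), values.end());
    bool sorted = thrust::is_sorted(values.begin(), values.end());

    printf("sorted: %s\n", sorted ? "yes" : "no");
    return sorted ? 0 : 1;
}
```

This would be compiled with nvcc as a .cu file. Swapping the first include below thrust/sort.h should reproduce the errors from the question on the affected CUDA versions.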

Licensed under: CC-BY-SA with attribution
