Yes, thrust::sort
can be combined with the thrust::seq
execution policy to sort numbers sequentially within a single CUDA thread (or sequentially within a single CPU thread):
#include <thrust/sort.h>
#include <thrust/execution_policy.h>
__global__
void mykernel(float* array, int arrayLength)
{
int threadID = blockIdx.x * blockDim.x + threadIdx.x;
// each thread sorts array
// XXX note this causes a data race
thrust::sort(thrust::seq, array, array + arrayLength);
}
Note that your example causes a data race because each CUDA thread attempts to sort the same data in parallel. A correct race-free program would partition array
according to thread index.
The thrust::seq
execution policy, which is required for this feature, is only available in Thrust v1.8 or better.