Calling a hand-written CUDA kernel with Thrust

https://stackoverflow.com/questions/2398031

25-09-2019

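Question

Since I need to sort a lot of large arrays with CUDA, I ended up using Thrust. So far, so good... but what do I do when I want to call a "hand-written" kernel on data that lives in a thrust::host_vector?

My approach was (the copy back to the host is missing):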

int CUDA_CountAndAdd_Kernel(thrust::host_vector<float> *samples, thrust::host_vector<int> *counts, int n) {

 thrust::device_ptr<float> dSamples = thrust::device_malloc<float>(n);
 thrust::copy(samples->begin(), samples->end(), dSamples);

 thrust::device_ptr<int> dCounts = thrust::device_malloc<int>(n);
 thrust::copy(counts->begin(), counts->end(), dCounts);

 float *dSamples_raw = thrust::raw_pointer_cast(dSamples);
 int *dCounts_raw = thrust::raw_pointer_cast(dCounts);

 CUDA_CountAndAdd_Kernel<<<1, n>>>(dSamples_raw, dCounts_raw);

 thrust::device_free(dCounts);
 thrust::device_free(dSamples);
}

The kernel looks like:

__global__ void CUDA_CountAndAdd_Kernel_Device(float *samples, int *counts)

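But compilation fails with:

error: argument of type "float **" is incompatible with parameter of type "thrust::host_vector<float> *"

Huh? I thought I was passing raw float and int pointers. Or am I missing something?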

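Solution

You are launching the kernel with the name of the host function that contains the launch, not with the name of the kernel itself. The compiler therefore checks the raw pointers against the host function's thrust::host_vector parameters, and the arguments do not match.

Change: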

CUDA_CountAndAdd_Kernel<<<1, n>>>(dSamples_raw, dCounts_raw);

to

CUDA_CountAndAdd_Kernel_Device<<<1, n>>>(dSamples_raw, dCounts_raw);

and see what happens.
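For completeness, here is a minimal sketch of the whole round trip, including the copy back that the question leaves out. Several things here are assumptions rather than the asker's code. The kernel body is hypothetical (the question only shows the signature), the extra `n` parameter and the bounds check are additions, and the host wrapper is renamed to `CUDA_CountAndAdd` so it can no longer be confused with the kernel. It also swaps `thrust::device_malloc`/`thrust::device_free` for `thrust::device_vector`, which frees its memory automatically.

```cuda
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <cuda_runtime.h>

// Hypothetical kernel body: increments counts[i] when samples[i] is positive.
// The extra parameter n is an assumption, added so the kernel can bounds-check.
__global__ void CUDA_CountAndAdd_Kernel_Device(float *samples, int *counts, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && samples[i] > 0.0f) {
        counts[i] += 1;
    }
}

// Host wrapper (hypothetical name, deliberately different from the kernel).
// Returns 0 on success and -1 on a size mismatch or CUDA error.
int CUDA_CountAndAdd(const thrust::host_vector<float> &samples,
                     thrust::host_vector<int> &counts)
{
    if (samples.size() != counts.size()) return -1;
    const int n = static_cast<int>(samples.size());
    if (n == 0) return 0;  // nothing to do; avoids a zero-block launch

    // Constructing a device_vector from a host_vector copies host -> device.
    thrust::device_vector<float> dSamples(samples);
    thrust::device_vector<int> dCounts(counts);

    // Kernels take raw pointers, not Thrust containers.
    float *dSamples_raw = thrust::raw_pointer_cast(dSamples.data());
    int *dCounts_raw = thrust::raw_pointer_cast(dCounts.data());

    // Launch enough blocks to cover n elements instead of <<<1, n>>>,
    // which stops working once n exceeds the per-block thread limit.
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    CUDA_CountAndAdd_Kernel_Device<<<blocks, threads>>>(dSamples_raw, dCounts_raw, n);

    if (cudaGetLastError() != cudaSuccess) return -1;       // launch errors
    if (cudaDeviceSynchronize() != cudaSuccess) return -1;  // execution errors

    // The missing back-copy: device -> host.
    thrust::copy(dCounts.begin(), dCounts.end(), counts.begin());
    return 0;
}
```

Called with samples {1.0f, -1.0f, 2.0f} and counts initialized to zero, this sketch should leave counts as {1, 0, 1}. It needs to be compiled with nvcc and run on a machine with a CUDA-capable device.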

Licensed under CC-BY-SA with attribution

Not affiliated with StackOverflow