Any arrays and memory objects that you use in an OpenCL kernel need to be allocated via the OpenCL API (e.g. using clCreateBuffer). This is because the host and device don't always share the same physical memory. A pointer to data that is allocated on the host (via malloc) means absolutely nothing to a discrete GPU, for example.
To pass an array of characters to an OpenCL kernel, you should write something along the lines of:
char *h_rp = (char*)malloc(length);
/* fill h_rp here: CL_MEM_COPY_HOST_PTR copies its contents when the buffer is created */
cl_mem d_rp = clCreateBuffer(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR, length, h_rp, &err);
err = clSetKernelArg(ckKernel, 0, sizeof(cl_mem), &d_rp);
and declare the argument with the __global (or __constant) qualifier in your kernel. You can then copy the data back to the host with clEnqueueReadBuffer.
If you do know that host and device share the same physical memory, then you can allocate memory that is visible to both host and device by creating a buffer with the CL_MEM_ALLOC_HOST_PTR flag, and using clEnqueueMapBuffer (paired with clEnqueueUnmapMemObject) when you wish to access the data from the host. The shared-virtual-memory (SVM) features of OpenCL 2.0 also improve the way that you can share buffers between host and device on unified-memory architectures.