OpenCL version of cudaMemcpyToSymbol & optimization

https://stackoverflow.com/questions/10413051

05-06-2021
|

Question

Can someone tell me OpenCl version of cudaMemcpyToSymbol for copying __constant to device and getting back to host?
Or usual clenquewritebuffer(...) will do the job ?
Could not find much help in forum. Actually a few lines of demo will suffice.

Also shall I expect same kind of optimization in opencl as that of CUDA using constant cache?

Thanks

Solution

Not sure about OpenCL.Net, but in plain OpenCL: yes, clenquewritebuffer is enough (just remember to create buffer with CL_MEM_READ_ONLY flag set).

Here is a demo from Nvidia GPU Computing SDK (OpenCL/src/oclQuasirandomGenerator/oclQuasirandomGenerator.cpp):

c_Table[i] = clCreateBuffer(cxGPUContext, CL_MEM_READ_ONLY, QRNG_DIMENSIONS * QRNG_RESOLUTION * sizeof(unsigned int),     
                 NULL, &ciErr);
ciErr |= clEnqueueWriteBuffer(cqCommandQueue[i], c_Table[i], CL_TRUE, 0, 
            QRNG_DIMENSIONS * QRNG_RESOLUTION * sizeof(unsigned int), tableCPU, 0, NULL,  NULL);

Constant memory in CUDA and in OpenCL are exactly the same, and provide the same type of optimization. That is, if you use nVidia GPU. On ATI GPUs, it should act similarly. And I doubt that constant memory would give you any benefit over global when run on CPU.

OTHER TIPS

I have seen people use cudaMemcpyToSymbol() for setting up constants in the kernel and the compiler could take advantage of those constants when optimizing the code. If one was to setup a memory buffer in openCL to pass such constants to the kernel then the compiler could not use them to optimize the code.

Instead the solution I found is to replace the cudaMemcpyToSymbol() with a print to a string that defines the symbol for the compiler. The compiler can take definitions in the form of -D FOO=bar for setting the symbol FOO to the value bar.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow