Monitor progress in OpenCL kernel

Question 1

No, there is no way to know how many work-groups or work-items have completed execution.

If you need a progress bar, it is probably because the kernel is either very slow or you have a lot of data to process. If your OpenCL app is very slow, I would suggest optimizing it, since a single call shouldn't take more than 1 second to complete.

However, if you have a lot of data to process, you can split the work into small chunks and enqueue them one after another. You can then track the completion of these chunks and report progress after each one.
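The chunking idea can be sketched on the host side as follows. This is only a minimal illustration, not code from any particular project: `run_in_chunks`, `launch_fn` and `fake_launch` are hypothetical names, and the launcher is a stand-in so the sketch runs without an OpenCL device. In a real program the callback would presumably wrap clEnqueueNDRangeKernel (passing the chunk start as the global work offset) followed by clFinish.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback: processes work items [offset, offset + count)
 * and returns 0 once they have finished. In real OpenCL host code this
 * would wrap clEnqueueNDRangeKernel + clFinish (an assumption here). */
typedef int (*launch_fn)(size_t offset, size_t count, void *user);

/* Runs `total` work items in chunks of at most `chunk` items and prints
 * the percentage done after each chunk.
 * Returns the number of chunks executed, or -1 on error. */
static int run_in_chunks(size_t total, size_t chunk, launch_fn launch, void *user)
{
    size_t done = 0;
    int chunks = 0;

    if (chunk == 0 || launch == NULL)
        return -1;

    while (done < total) {
        /* The last chunk may be smaller than the others. */
        size_t count = (total - done < chunk) ? total - done : chunk;

        if (launch(done, count, user) != 0)
            return -1;

        done += count;
        chunks++;
        fprintf(stderr, "\rprogress: %3d%%", (int)(100.0 * done / total));
    }
    if (total > 0)
        fputc('\n', stderr);
    return chunks;
}

/* Stand-in launcher for demonstration: only counts the processed items. */
static int fake_launch(size_t offset, size_t count, void *user)
{
    (void)offset;
    *(size_t *)user += count;
    return 0;
}
```

The chunk size is a trade-off: smaller chunks give a smoother progress bar but add launch overhead per call.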

Question 2

A not-so-robust solution is to allocate a buffer with the CL_MEM_ALLOC_HOST_PTR flag so that it is backed by host memory, let the kernel update a progress value in that buffer, and then, on the host side, use a while loop to read the value and print a progress bar.

Here is the declaration: https://github.com/fangq/mcxcl/blob/master/src/mcx_host.cpp#L428-L431

Here is the update inside the kernel: https://github.com/fangq/mcxcl/blob/master/src/mcx_core.cl#L845-L848

Here is the host-side value retrieval and progress bar printing: https://github.com/fangq/mcxcl/blob/master/src/mcx_host.cpp#L583-L606

This works OK on AMD GPUs, although the update is somewhat sparse: the progress variable is only updated a few times during the kernel runtime, so the progress bar jumps unevenly. On NVIDIA and Intel devices, however, nothing is printed until the kernel is complete.
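The host-side half of this approach, reading a counter and drawing a bar, might look roughly like the sketch below. It is not the mcxcl code linked above: `render_progress` and `poll_progress` are hypothetical names, and the `progress` pointer is assumed to be the host mapping of the CL_MEM_ALLOC_HOST_PTR buffer (for example one obtained from clEnqueueMapBuffer). Here it is an ordinary variable so the sketch runs without a device.

```c
#include <stddef.h>
#include <stdio.h>

/* Renders a bar such as "[####----]  50%" into buf.
 * Returns the number of filled cells, or -1 on invalid input
 * (zero total, non-positive width, or a buffer that is too small). */
static int render_progress(unsigned done, unsigned total, int width,
                           char *buf, size_t buflen)
{
    int filled, i;

    if (total == 0 || width <= 0 || buflen < (size_t)width + 8)
        return -1;
    if (done > total)
        done = total;

    filled = (int)((double)done / total * width);
    buf[0] = '[';
    for (i = 0; i < width; i++)
        buf[1 + i] = (i < filled) ? '#' : '-';
    snprintf(buf + 1 + width, buflen - 1 - (size_t)width, "] %3u%%",
             (unsigned)(100.0 * done / total));
    return filled;
}

/* Polls *progress until it reaches `total`, redrawing the bar whenever
 * the value changes. Returns the number of redraws.
 * Assumption: `progress` points at memory the kernel updates; a real
 * loop would also sleep briefly between reads instead of spinning. */
static int poll_progress(const volatile unsigned *progress, unsigned total)
{
    char bar[64];
    unsigned now, last = 0;
    int redraws = 0, first = 1;

    do {
        now = *progress;
        if (first || now != last) {
            if (render_progress(now, total, 40, bar, sizeof bar) >= 0) {
                fprintf(stderr, "\r%s", bar);
                redraws++;
            }
            last = now;
            first = 0;
        }
    } while (now < total);

    fputc('\n', stderr);
    return redraws;
}
```

Whether the loop ever sees an intermediate value depends on the device and driver; as described above, on some devices the value only becomes visible once the kernel has finished.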

To try my code, build mcxcl and run the quick test:

git clone https://github.com/fangq/mcxcl.git
cd mcxcl/src
make clean all
cd ../example/quicktest
./run_qtest.sh -D P

I asked this question on NVIDIA's forum, but no one there knew how to fix it for NVIDIA devices.

https://devtalk.nvidia.com/default/topic/1031335/cuda-programming-and-performance/how-to-update-host-memory-variable-from-device-to-host-during-kernel-execution-in-opencl/