Question

My algorithm consists of two steps:

  1. Data generation. In this step I fill a data array in a loop, each element being the result of some function.
  2. Data processing. For this step I have written an OpenCL kernel that processes the data array generated in the previous step.

Currently the first step runs on the CPU because it is hard to parallelize. I want to run it on the GPU because each generation step takes some time, and I want the second step to start immediately on the data that has already been generated.

Can I launch another OpenCL kernel from a currently running kernel in a separate thread? Or will it run in the same thread as the calling kernel?

Some pseudocode to illustrate my point:

__kernel void second(__global int *data, int index) {
    // Work on data[index]. This takes a long time.
}

__kernel void first(__global int *data, const int length) {
    for (int i = 0; i < length; i++) {
        // Generate a value and store it in data[i].

        // Will this kernel run in the same thread as the caller, or in a new one?
        // If in the same thread, is there a way to launch it in a separate thread?
        second(data, i);
    }
}

Solution

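No. OpenCL has no concept of threads (the unit of execution is the work-item), and in OpenCL 1.x a running kernel cannot launch another kernel: every kernel launch is enqueued by the host (CPU). Calling a __kernel function from inside another kernel, as in the pseudocode, is compiled as an ordinary function call and runs in the calling work-item. OpenCL 2.0 added device-side enqueue (enqueue_kernel), but only on devices and drivers that support it.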

Other tips

Launch the first kernel, call clFinish(), and then launch the next kernel.

There are more efficient ways, but they involve events and would only complicate things here.

Just use the output buffer of the first kernel as the input of the second one. That way you avoid the CPU->GPU copy.
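The sequence described above might look roughly like this on the host side. This is only a sketch: the helper name run_two_steps is hypothetical, the queue, kernels, and buffer are assumed to have been created elsewhere, and the processing kernel is assumed to take just the buffer and derive its index from get_global_id(0). Running the generator as a single work-item is also an assumption, based on the question saying that step is hard to parallelize. It needs the OpenCL headers and runtime to build.

```c
#include <CL/cl.h>

/* Hypothetical helper: enqueue both steps on the same device buffer.
 * queue, first, second and data are assumed to be valid objects
 * created elsewhere by the caller. */
cl_int run_two_steps(cl_command_queue queue, cl_kernel first,
                     cl_kernel second, cl_mem data, cl_int length)
{
    cl_int err;
    size_t one = 1;                 /* assumed: generator runs as one work-item */
    size_t global = (size_t)length; /* one work-item per element for step 2 */

    /* Step 1: the generation kernel writes its results into `data`. */
    err = clSetKernelArg(first, 0, sizeof(cl_mem), &data);
    if (err != CL_SUCCESS) return err;
    err = clSetKernelArg(first, 1, sizeof(cl_int), &length);
    if (err != CL_SUCCESS) return err;
    err = clEnqueueNDRangeKernel(queue, first, 1, NULL, &one, NULL,
                                 0, NULL, NULL);
    if (err != CL_SUCCESS) return err;

    /* Block until step 1 has completed on the device. */
    err = clFinish(queue);
    if (err != CL_SUCCESS) return err;

    /* Step 2: the processing kernel reads the same buffer, so the
     * data never travels back to the host in between. */
    err = clSetKernelArg(second, 0, sizeof(cl_mem), &data);
    if (err != CL_SUCCESS) return err;
    err = clEnqueueNDRangeKernel(queue, second, 1, NULL, &global, NULL,
                                 0, NULL, NULL);
    if (err != CL_SUCCESS) return err;

    return clFinish(queue);
}
```

The event-based alternative would pass a cl_event as the last argument of the first clEnqueueNDRangeKernel call and put that event in the wait list of the second call, instead of blocking the host with the intermediate clFinish().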

I believe the global work size can be regarded as the number of threads (work-items, in OpenCL terms) that will be executed, one way or another. Correct me if I'm wrong.
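To illustrate the point about global work size, a processing kernel could be written so that each work-item handles one element. This is a sketch in OpenCL C, not the asker's actual kernel: the doubling is a placeholder for the real work, and it assumes the kernel is enqueued with a one-dimensional global work size equal to the array length and no global offset.

```c
// OpenCL C device code (illustrative sketch only).
__kernel void second(__global int *data)
{
    // With a global work size of N and no global offset,
    // get_global_id(0) takes each value from 0 to N-1 exactly once,
    // so every element is handled by its own work-item.
    size_t i = get_global_id(0);

    // Placeholder for the real, expensive processing of data[i].
    data[i] = data[i] * 2;
}
```

How many of those work-items actually run at the same time is up to the device; the global work size only fixes how many are executed in total.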

Licensed under: CC-BY-SA with attribution