Question

I am porting CUDA code to OpenCL. CUDA allows C++ constructs such as templates, while OpenCL is strictly C99. What is the most painless way of porting templates to C? I thought of using function pointers for the template parameters.


Solution

Before there were templates, there were preprocessor macros.

Search the web for "generic programming in C" for inspiration.
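As a rough illustration of the macro approach, the sketch below stamps out one typed copy of a function per macro invocation, with the "template parameters" pasted into the function name. The names DEFINE_SUM and sum_N_T are made up for this example. The body sticks to constructs that are valid in C99 as well as C++, so the same pattern should carry over to OpenCL C kernels.

```cpp
// Hypothetical macro standing in for:
//   template <int N, typename T> T sum(const T *v);
// Each invocation defines one concrete function named sum_<N>_<T>,
// so instances with different parameters do not collide.
#define DEFINE_SUM(N, T)                    \
    static T sum_##N##_##T(const T *v)      \
    {                                       \
        T acc = 0;                          \
        for (int i = 0; i < N; ++i)         \
            acc += v[i];                    \
        return acc;                         \
    }

// "Instantiate" the template for the parameter sets that are needed.
DEFINE_SUM(4, int)    // defines sum_4_int
DEFINE_SUM(3, float)  // defines sum_3_float
```

Call sites then use the concrete name, for example sum_4_int(values). The main cost compared with real templates is that every instance has to be requested explicitly, and type names made of several tokens (such as unsigned int) need a typedef before they can be pasted into an identifier.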

OTHER TIPS

Here is the technique I used to convert some of the CUDA algorithms from the Modern GPU code to my GPGPU library VexCL (which supports OpenCL).

Each template function in the CUDA code is converted to two template functions in the OpenCL host code. The first host function (the 'name' function) returns the mangled name of the generated OpenCL function, so that functions with different template parameters get different names. The second host function (the 'source' function) returns the source code of the generated OpenCL function as a string. These functions are then used to generate the main kernel code.
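A stripped-down sketch of this name/source pattern might look like the following. It is not VexCL's actual code: type_name, sum_name, sum_source and kernel_source are hypothetical helpers invented for the example, and plain string streams stand in for a real source generator.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: maps a C++ type to its OpenCL C spelling.
template <typename T> struct type_name;
template <> struct type_name<int>   { static std::string get() { return "int"; } };
template <> struct type_name<float> { static std::string get() { return "float"; } };

// 'name' function: returns a mangled name unique to this <N, T> instance.
template <int N, typename T>
std::string sum_name() {
    std::ostringstream s;
    s << "sum_" << N << "_" << type_name<T>::get();
    return s.str();
}

// 'source' function: returns the OpenCL C definition of this instance.
template <int N, typename T>
std::string sum_source() {
    const std::string t = type_name<T>::get();
    std::ostringstream s;
    s << t << " " << sum_name<N, T>() << "(const " << t << " *v)\n"
      << "{\n"
      << "  " << t << " acc = 0;\n"
      << "  for (int i = 0; i < " << N << "; ++i) acc += v[i];\n"
      << "  return acc;\n"
      << "}\n";
    return s.str();
}

// Kernel generation: emit the definition once via the 'source' function,
// then emit the call via the 'name' function.
template <int N, typename T>
std::string kernel_source() {
    const std::string t = type_name<T>::get();
    std::ostringstream s;
    s << sum_source<N, T>()
      << "kernel void row_sums(global const " << t << " *in, global " << t << " *out)\n"
      << "{\n"
      << "  size_t row = get_global_id(0);\n"
      << "  " << t << " v[" << N << "];\n"
      << "  for (int i = 0; i < " << N << "; ++i) v[i] = in[row * " << N << " + i];\n"
      << "  out[row] = " << sum_name<N, T>() << "(v);\n"
      << "}\n";
    return s.str();
}
```

The string returned by kernel_source<4, int>() can then be handed to the OpenCL runtime for compilation. Because the host side is ordinary C++, the template machinery stays in the host code and only concrete, fully named functions reach the OpenCL compiler.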

Take, for example, the CTAMergeSort CUDA function template. It gets converted to two overloads of the merge_sort function in the VexCL code. I call the 'source' function to add the function definition to the OpenCL kernel source, and then use the 'name' function to add a call to it in the kernel.

Note that the backend::source_generator in VexCL is used in order to generate either OpenCL or CUDA code transparently. In your case the code generation could be much simpler.

To make this clearer, here is the code that gets generated for the mergesort<256,11,int,float> template instance:

void mergesort_256_11_int_float
(
  int count,
  int tid,
  int * thread_keys0,
  local int * keys_shared0,
  float * thread_vals0,
  local float * vals_shared0
)
{
  if(11 * tid < count) odd_even_transpose_sort_11_int_float(thread_keys0, thread_vals0);
  thread_to_shared_11_int(thread_keys0, tid, keys_shared0);
  block_sort_loop_256_11_int_float(tid, count, keys_shared0, thread_vals0, vals_shared0);
}

Take a look at Boost.Compute. It provides a C++, STL-like API for OpenCL.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow