Question

I'm building a project which uses both Thrust (a CUDA library) and OpenMP. The main purpose of my program is to present an interface for calculating something in parallel. To do that I've decided to use the STRATEGY design pattern, which basically means that we define a base class with a virtual function, and then other classes derive from that base class and implement the needed function.

My problems start here:

1. Can my project have more than one .cu file?
2. Can .cu files contain class declarations?

  class foo
        {
            int m_name;
            void doSomething();
        };

3. This one continues from 2: I've heard that device kernels cannot be declared inside classes and have to be declared like this:

//header file
__DEVICE__ void kernel(int x, int y)
{
    ...
}

class a : foo
{
   void doSomething();
};

//cu file

void a::doSomething()
{
  kernel<<<1,1>>>(...);
}

Is that the right way? 4. Last question: when I use Thrust, must I use .cu files as well?

Thanks, igal


Solution

  1. Yes, you can use multiple .cu files in your project.
  2. Yes, but there are restrictions. According to *CUDA_C_Programming_Guide* v.4.0, section 3.1.5:

    The front end of the compiler processes CUDA source files according to C++ syntax rules. Full C++ is supported for the host code. However, only a subset of C++ is fully supported for the device code as described in Appendix D. As a consequence of the use of C++ syntax rules, void pointers (e.g., returned by malloc()) cannot be assigned to non-void pointers without a typecast.

  3. You're ALMOST correct. You have to use the __global__ keyword when declaring your kernel:

    __global__ void kernel(int x, int y)
    {
        ...
    }
    
  4. Well, yes. Your Thrust-based device code has to be compiled with nvcc. See the Thrust documentation for details.
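Putting answers 2 and 3 together, a minimal sketch of how the strategy pattern can sit on top of a kernel (all class, file, and function names here are made up for illustration) could look like this:

    // strategy.h -- plain C++, safe to include from host-only code
    class Calculator
    {
    public:
        virtual ~Calculator() {}
        virtual void doSomething() = 0;
    };

    // gpu_calculator.cu -- compiled with nvcc
    #include "strategy.h"

    __global__ void kernel(int x, int y)
    {
        // ... device code ...
    }

    class GpuCalculator : public Calculator
    {
    public:
        void doSomething()
        {
            kernel<<<1, 1>>>(0, 0);     // kernel is declared at file scope, not inside the class
            cudaDeviceSynchronize();    // wait for the kernel to finish
        }
    };

The kernel itself lives at file scope in the .cu file; the class method only launches it. An OpenMP-based implementation could derive from the same Calculator base in a plain .cpp file.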

In general, you will compile your program like this:

    $ nvcc -c device.cu
    $ g++  -c host.cpp   -I/usr/local/cuda/include/
    $ nvcc device.o host.o

Alternatively, you can use g++ to perform the final linking step:

    $ g++ -o tester device.o host.o -L/usr/local/cuda/lib64 -lcudart

On Windows, change the paths after -I and -L. Also, as far as I know, you have to use the cl compiler (MS Visual Studio).

Note 1: Watch out for x86/x64 compatibility: if you use the 64-bit CUDA Toolkit, also use a 64-bit compiler (check nvcc's -m32 and -m64 options as well).

Note 2: device.cu contains the kernels and a function that invokes the kernel(s). This function has to be annotated with extern "C". It can contain classes (limitations apply). host.cpp contains pure C++ code with an extern "C" declaration of the function that is in device.cu (NOT of the kernel).
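As an illustration of Note 2, the split might look like this (file and function names are made up):

    // device.cu -- compiled by nvcc
    __global__ void kernel(int x, int y)
    {
        // ... device code ...
    }

    extern "C" void launchKernel(int x, int y)
    {
        kernel<<<1, 1>>>(x, y);
        cudaDeviceSynchronize();
    }

    // host.cpp -- compiled by g++ (or cl on Windows)
    extern "C" void launchKernel(int x, int y);  // implemented in device.cu

    int main()
    {
        launchKernel(0, 0);  // host code never sees the <<<...>>> launch syntax
        return 0;
    }

Because the wrapper is declared extern "C", its name is not mangled, so the object files produced by the two different compilers link cleanly.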

Licensed under: CC-BY-SA with attribution