Question

I was able to pass ordinary objects to kernel functions by value. However, when I added virtual inheritance to the class hierarchy, nvcc reported an error saying that classes with a user-defined copy constructor cannot be used as arguments for a kernel launch. I don't have a user-defined copy constructor, so my guess is that virtual inheritance makes the compiler generate a different kind of copy constructor internally. Can someone explain how virtual inheritance is actually implemented, and whether there is a workaround? Or is there absolutely no way to use virtual inheritance when writing CUDA code?

The code is like this:

class Base {...};
class ChildA: public virtual Base {...};
class ChildB: public virtual Base {...};
class GrandChild: public ChildA, public ChildB {...};

__global__ void mykernel(Base x) {...}

int main() {
  GrandChild x;
  mykernel<<<1,1>>>(x);
  return 0;
}

EDIT: Here is my guess: I think nvcc allows only the default copy constructor because in that case it can simply use something like cudaMemcpyAsync to push the arguments onto the call stack on the device. The compiler is therefore hard-coded to accept only default copy constructors, yet objects with virtual inheritance internally have a different kind of copy constructor, which triggers the error in nvcc. However, I still believe there should be an easy way for nvcc to allow it, given that nvcc supports virtual functions and other advanced C++ features.


Solution

Thanks for all the comments. According to the following two statements in the link @RobertCrovella provided, what I did is not allowed.

  1. It is not allowed to pass as an argument to a __global__ function an object of a class with virtual functions.
  2. It is not allowed to pass as an argument to a __global__ function an object of a class derived from virtual base classes.

Together with the link @Joky provided, I believe the reason is that nvcc uses a simple memory copy to push arguments from host memory onto the call stack in device memory. This is why no non-default copy constructor is allowed: only the default (member-wise) copy constructor is consistent with this simple memory copy. A class with a virtual base also carries hidden pointers that are set up on the host, and a raw byte copy would carry those host-side pointers to the device unchanged.
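To make the "simple memory copy" argument concrete, here is a minimal host-side C++ sketch that uses the standard trait std::is_trivially_copyable as a stand-in for "safe to copy byte by byte". The class names and the helper safe_to_memcpy are hypothetical, and this trait is only an approximation of the idea; it is not claimed to be the exact check nvcc performs.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the kinds of classes discussed above.
struct Base { int* data; };

// Ordinary (non-virtual) inheritance: copying is still a plain byte copy.
struct PlainChild : Base {};

// Virtual inheritance: the compiler-generated copy constructor is no longer
// trivial, so the type is not trivially copyable.
struct VirtualChild : virtual Base {};

// A class with a virtual function is not trivially copyable either.
struct WithVirtualFn {
    virtual void f() {}
    int* data;
};

// Hypothetical helper: true if T can be duplicated with a raw memory copy
// according to the C++ standard's notion of trivial copyability.
template <typename T>
constexpr bool safe_to_memcpy() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(safe_to_memcpy<PlainChild>(),
              "plain inheritance keeps the type trivially copyable");
static_assert(!safe_to_memcpy<VirtualChild>(),
              "a virtual base makes the copy constructor non-trivial");
static_assert(!safe_to_memcpy<WithVirtualFn>(),
              "virtual functions make the copy constructor non-trivial");
```

The two failing cases line up with the two quoted restrictions: virtual functions and virtual base classes are exactly the features that make the implicitly generated copy constructor non-trivial in standard C++.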

The solution I'm using is to break one inheritance relationship and set up a type conversion operator to preserve the missing link for upcasting, as shown below. This works well for me because all these classes are just wrappers around pointers, so the conversion is cheap. The price is that I sometimes have to convert explicitly, as in the call to mykernel2 below.

class Base {...};
class ChildA: public Base {...};
class ChildB: public Base {...};
class GrandChild: public ChildA {
public:
  operator ChildB () {...}
};

__global__ void mykernel(Base x) {...}
__global__ void mykernel2(ChildB y) {...}

int main() {
  GrandChild x;
  mykernel<<<1,1>>>(x);
  mykernel2<<<1,1>>>(ChildB(x));
  return 0;
}
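As a rough illustration of why this workaround keeps the argument types copyable by a plain memory copy, here is a host-only C++ sketch with the elided class bodies filled in by assumption. The member ptr, the body of the conversion operator, and the functions take_base and take_child_b (ordinary functions standing in for the kernels) are all hypothetical and not taken from the original code.

```cpp
#include <type_traits>

// Hypothetical pointer-wrapper hierarchy; the member `ptr` is an assumption.
struct Base { float* ptr; };
struct ChildA : Base {};
struct ChildB : Base {};

// GrandChild inherits only from ChildA. The dropped "is-a ChildB" link is
// replaced by a conversion operator that copies the wrapped pointer.
struct GrandChild : ChildA {
    operator ChildB() const {
        ChildB b;
        b.ptr = ptr;
        return b;
    }
};

// With no virtual bases and no virtual functions, every type in the
// hierarchy stays trivially copyable.
static_assert(std::is_trivially_copyable<ChildB>::value,
              "ChildB can be copied byte by byte");
static_assert(std::is_trivially_copyable<GrandChild>::value,
              "GrandChild can be copied byte by byte");

// Ordinary functions standing in for mykernel and mykernel2: each takes its
// argument by value, as a kernel parameter would.
inline float* take_base(Base x) { return x.ptr; }
inline float* take_child_b(ChildB y) { return y.ptr; }
```

Passing a GrandChild to take_base uses the normal derived-to-base conversion, while take_child_b(ChildB(g)) goes through the conversion operator, mirroring the explicit ChildB(x) in the kernel launch above.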
Licensed under: CC-BY-SA with attribution