The key thing to keep in mind is that cusp is a template library built on top of thrust, which is itself a template library. The member functions of both cusp and thrust classes contain a lot of device code, so if you inherit from those classes, you inherit a lot of device code as well.
This means that, however you choose to structure the code, in "classic" CUDA compilation the point of instantiation of your class and all of the cusp and thrust template code it pulls in must be in the same translation unit, and that translation unit must be presented to nvcc in a form it will recognize as requiring device code compilation (i.e. in a .cu file, or with an appropriate compiler switch).
So this sequence of defining a derived class and instantiating it:
class CuspMatrix
: public cusp::csr_matrix<int,float,cusp::device_memory>
{
...
};
...
{
CuspMatrix A;
...
}
must be compiled with nvcc.
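One way to live with this constraint is to confine the derived class to a single nvcc-compiled file and expose only a plain, non-template interface to the rest of the program. The following is a minimal sketch of that idea, not a definitive layout. The names MatrixHandle and CuspMatrixImpl and the file names in the comments are hypothetical, and CuspMatrixImpl is a plain C++ stand-in so that the sketch compiles without cusp installed. In real code it would be the cusp-derived class, and everything below the second marker would live in a .cu file.

```cpp
#include <cstddef>
#include <memory>

// ---- would live in matrix_handle.h (hypothetical name) ----
// Safe to include from ordinary .cpp files: no cusp or thrust headers,
// only a forward declaration of the implementation class.
class CuspMatrixImpl;

class MatrixHandle {
public:
    MatrixHandle(std::size_t rows, std::size_t cols);
    ~MatrixHandle();  // defined where CuspMatrixImpl is a complete type
    std::size_t num_rows() const;
    std::size_t num_cols() const;
private:
    std::unique_ptr<CuspMatrixImpl> impl_;
};

// ---- would live in matrix_handle.cu (hypothetical name), built by nvcc ----
// ASSUMPTION: stand-in for the real class. In actual code this would be
//   class CuspMatrixImpl
//     : public cusp::csr_matrix<int,float,cusp::device_memory> { ... };
// so that the class and every instantiation of it share one translation unit.
class CuspMatrixImpl {
public:
    CuspMatrixImpl(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols) {}
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_;
    std::size_t cols_;
};

// The only place the implementation class is instantiated.
MatrixHandle::MatrixHandle(std::size_t rows, std::size_t cols)
    : impl_(new CuspMatrixImpl(rows, cols)) {}

MatrixHandle::~MatrixHandle() = default;

std::size_t MatrixHandle::num_rows() const { return impl_->rows(); }
std::size_t MatrixHandle::num_cols() const { return impl_->cols(); }
```

With this split, host-only code would include just the header and use MatrixHandle, while the file that defines CuspMatrixImpl is the only one that has to go through nvcc. If that file cannot carry a .cu extension, nvcc's `-x cu` option tells it to treat the input as CUDA source.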