Question

I'm writing a C++ library which does some computation on vectors of audio data.

The library supports both GPUs (using Thrust, a C++ STL-like library for GPUs) and CPUs (using the STL). I'm using CUDA Toolkit 10.2, which is limited to GCC 8 (and thus limits me to C++14). All of this is on an amd64 desktop computer running Fedora 32.

The library contains different classes, and each class has a CPU and a GPU version. I'm looking for a neat way to define CPU/GPU variants without duplicating code. Sometimes when I fix a bug in the GPU algorithm, I forget to go and fix it in the CPU algorithm, and vice versa. Also, it would be nice if it could be something at the library level, so that if I instantiate "AlgorithmA-CPU", it internally uses "AlgorithmB-CPU", and similarly for GPU.

Here's a simple example:

struct WindowCPU {
    std::vector<float> window{1.0, 2.0, 3.0};
};

struct WindowGPU {
    thrust::device_vector<float> window{1.0, 2.0, 3.0};
};

class AlgorithmCPU {
public:
    std::vector<float> scratch_buf;
    WindowCPU window;

    AlgorithmCPU(size_t size) : scratch_buf(size, 0.0F) {}

    void process_input(std::vector<float>& input) {
        // using thrust, the code ends up looking identical
        thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
    }
};

class AlgorithmGPU {
public:
    thrust::device_vector<float> scratch_buf;
    WindowGPU window;

    AlgorithmGPU(size_t size) : scratch_buf(size, 0.0F) {}

    void process_input(thrust::device_vector<float>& input) {
        // using thrust, the code ends up looking identical
        thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
    }
};

The example is overly simplified, but it shares the problem with all of my algorithms - the code is the same, except with different data types - CPU uses "std::vector", while GPU uses "thrust::device_vector". Also, there is a sort of "cascading" specialization - "AlgorithmCPU" uses "WindowCPU", and similar for GPU.

Here's one real example I have in my code currently, applied to the above fake algorithm, to reduce code duplication:

template <typename A>
static void _execute_algorithm_priv(A input, A output) {
        thrust::transform(input.begin(), input.end(), output.begin(), some_functor());
}

// GPU specialization
void AlgorithmGPU::process_input(thrust::device_vector<float>& input)
{
        _execute_algorithm_priv<thrust::device_vector<float>&>(
            input, scratch_buf);
}

// CPU specialization
void AlgorithmCPU::process_input(std::vector<float>& input)
{
        _execute_algorithm_priv<std::vector<float>&>(
            input, scratch_buf);
}

Now in the real code, I have many algorithms, some are huge. My imagination can't stretch to a global library-wide solution. I thought of something using an enum:

enum ComputeBackend {
    GPU,
    CPU
};

Afterwards, I would create templates of classes based on the enum - but I'd need to map the enum to different data types:

template <ComputeBackend b> class Algorithm {
// somehow define other types based on the compute backend

if (ComputeBackend b == ComputeBackend::CPU) {
    vector_type = std::vector<float>;
    other_type = Ipp32f;
} else {
    vector_type = thrust::device_vector<float>;
    other_type = Npp32f;
}
}

I read about "if constexpr", but that is a C++17 feature, so I don't believe I can use it in C++14.

edit

Here's my solution based on the replies so far:

enum Backend {
        GPU,
        CPU
};

template<Backend T>
struct TypeTraits {};

template<>
struct TypeTraits<Backend::GPU> {
        typedef thrust::device_ptr<float> InputPointer;
        typedef thrust::device_vector<float> RealVector;
        typedef thrust::device_vector<thrust::complex<float>> ComplexVector;
};

template<>
struct TypeTraits<Backend::CPU> {
        typedef float* InputPointer;
        typedef std::vector<float> RealVector;
        typedef std::vector<thrust::complex<float>> ComplexVector;
};

template<Backend B> class Algorithm {

        typedef typename TypeTraits<B>::InputPointer InputPointer;
        typedef typename TypeTraits<B>::RealVector RealVector;
        typedef typename TypeTraits<B>::ComplexVector ComplexVector;

        public:
                RealVector scratch_buf;
                
                void process_input(InputPointer input);
};

Solution

One possible solution is to use templates and move the CPU / GPU specific stuff into a traits class:

struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;
};

struct GPUBackendTraits {
    template <typename T>
    using vector_type = thrust::device_vector<T>;
};

template <typename BackendTraits>
struct Window {
    // "template" is required because vector_type is a dependent member template
    typename BackendTraits::template vector_type<float> window{1.0, 2.0, 3.0};
};

template <typename BackendTraits>
class Algorithm {
public:
    // "template" is required because vector_type is a dependent member template
    typename BackendTraits::template vector_type<float> scratch_buf;
    Window<BackendTraits> window;

    Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}

    void process_input(typename BackendTraits::template vector_type<float>& input) {
        thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
    }
};

The typename BackendTraits:: prefix can be annoying, but it can be omitted by adding a corresponding typedef or using declaration to the class.
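As a rough sketch of that alias approach: the class below declares the vector type once and then uses the short name everywhere. To keep it compilable without Thrust, DequeBackendTraits is a hypothetical stand-in for the GPU traits, and RealVector, result() and the doubling loop are names and placeholder logic made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// CPU traits as in the answer above.
struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;
};

// Hypothetical second backend, standing in for GPUBackendTraits
// so that this sketch builds without Thrust.
struct DequeBackendTraits {
    template <typename T>
    using vector_type = std::deque<T>;
};

template <typename BackendTraits>
class Algorithm {
    // Declare the alias once; the rest of the class uses the short name.
    using RealVector = typename BackendTraits::template vector_type<float>;

public:
    explicit Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}

    // Placeholder computation: writes 2 * input[i] into the scratch buffer.
    void process_input(const RealVector& input) {
        assert(input.size() == scratch_buf.size());
        auto out = scratch_buf.begin();
        for (float x : input) {
            *out++ = 2.f * x;
        }
    }

    const RealVector& result() const { return scratch_buf; }

private:
    RealVector scratch_buf;
};
```

With this in place, Algorithm<CPUBackendTraits> takes a std::vector<float> and Algorithm<DequeBackendTraits> takes a std::deque<float>, without the long prefix appearing in any member signature.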

In some cases, you may not only want to use different types, but also call different code. This can be done, for example, by adding the code as a function to the traits class. However, using function overloads can sometimes be less confusing:

void do_something(std::vector<float>& input) {
    // do something std::vector specific
}

void do_something(thrust::device_vector<float>& input) {
    // do something thrust::device_vector specific
}

template <typename BackendTraits>
class Algorithm {
    void do_something_backend_specific() {
        typename BackendTraits::template vector_type<float> buf = ...;
        // Call either the std::vector or the thrust::device_vector overload:
        do_something(buf);
    }
};
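For comparison, the other option mentioned above (putting the backend-specific code into the traits class as a function) might look roughly like this. It is only a sketch: sum() and mean() are made-up names, and DequeBackendTraits is a hypothetical stand-in for the GPU traits so the code compiles without Thrust.

```cpp
#include <cassert>
#include <deque>
#include <numeric>
#include <vector>

struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;

    // Backend-specific code lives next to the backend-specific types.
    static float sum(const vector_type<float>& v) {
        return std::accumulate(v.begin(), v.end(), 0.f);
    }
};

// Hypothetical second backend, standing in for the GPU traits.
struct DequeBackendTraits {
    template <typename T>
    using vector_type = std::deque<T>;

    // A deliberately different implementation of the same operation.
    static float sum(const vector_type<float>& v) {
        float total = 0.f;
        for (float x : v) {
            total += x;
        }
        return total;
    }
};

template <typename BackendTraits>
class Algorithm {
public:
    using RealVector = typename BackendTraits::template vector_type<float>;

    // The generic code calls through the traits class and never names a backend.
    float mean(const RealVector& input) const {
        assert(!input.empty());
        return BackendTraits::sum(input) / static_cast<float>(input.size());
    }
};
```

The trade-off is that every traits class has to provide every such function, whereas with free-function overloads you only add an overload where a backend actually needs special handling.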

There are some advantages over an enum with conditionals:

  • Template programming allows you to use different types.
  • No need for if-else statements that choose the backend. Just pass the traits class around and everything automatically uses the same backend.
  • Adding new backends is simple, just add a new traits class.

Of course, there are also some disadvantages:

  • Reading and writing template code can be harder than reading and writing code with concrete types.
  • All code is in templates, so if you want to use .cpp files for that code, you must add explicit template instantiations.
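To illustrate the last point, here is a minimal single-file sketch of how the header / .cpp split with explicit instantiations could be laid out. The file names in the comments and the scratch_size() member are assumptions for illustration only, and the GPU instantiation is left as a comment because it would need Thrust and, presumably, a translation unit compiled by nvcc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- algorithm.h (hypothetical file name): declarations only ----
struct CPUBackendTraits {
    template <typename T>
    using vector_type = std::vector<T>;
};

template <typename BackendTraits>
class Algorithm {
public:
    using RealVector = typename BackendTraits::template vector_type<float>;

    explicit Algorithm(std::size_t size);
    void process_input(const RealVector& input);
    std::size_t scratch_size() const;

private:
    RealVector scratch_buf;
};

// ---- algorithm.cpp (hypothetical file name): definitions ----
template <typename BackendTraits>
Algorithm<BackendTraits>::Algorithm(std::size_t size) : scratch_buf(size, 0.f) {}

template <typename BackendTraits>
void Algorithm<BackendTraits>::process_input(const RealVector& input) {
    // Placeholder computation: copy the input into the scratch buffer.
    assert(input.size() == scratch_buf.size());
    scratch_buf = input;
}

template <typename BackendTraits>
std::size_t Algorithm<BackendTraits>::scratch_size() const {
    return scratch_buf.size();
}

// Explicit instantiations: one line per backend the library ships.
template class Algorithm<CPUBackendTraits>;
// template class Algorithm<GPUBackendTraits>;  // assumed to live in a .cu file
```

Code that only includes the header can then use Algorithm<CPUBackendTraits> and link against the compiled definitions; any backend without an explicit instantiation fails at link time.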
Licensed under: CC-BY-SA with attribution