Question

I have a functor used by Thrust in which I need to set the length of a local array dynamically, like this:

struct func { 

       const int h;

       func(const int _h): h(_h) {}

       __device__ __host__
       void operator()(int id) {
              double data[h];
      }
};

I'm not sure how to do this, because the array length has to be known at compile time, but h is not known until run time.


Solution

The obvious way to solve this is to use dynamic memory allocation, so the functor's operator becomes

   __device__ __host__
   void operator()(int id) {
        double *data  = new double[h];

        // functor code goes here

        // Heap memory has context scope, so delete is necessary to stop leaks
        delete[] data; 
   }

This will work on GPUs of compute capability 2.0 or newer. The downside is that the memory comes from the runtime heap in global memory, which limits compiler optimisations, and the new/delete operators themselves are very slow, so having every thread in the kernel launch call them will cost a lot of performance.
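As a rough, self-contained sketch of how the dynamic version might look once assembled into a whole functor: the HD macro, the out pointer, the placeholder sum-of-squares computation, and the run_dynamic_on_host helper are all assumptions added for illustration and are not part of the original code. The macro expands to __host__ __device__ under nvcc and to nothing otherwise, so the same source also builds with a plain host compiler.

```cpp
#include <cassert>
#include <vector>

// Assumption: HD is a convenience macro, not something Thrust or CUDA provides.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct func_dynamic {
    const int h;
    double *out;  // hypothetical output buffer, one slot per id

    func_dynamic(int _h, double *_out) : h(_h), out(_out) {}

    HD void operator()(int id) const {
        // Per-call scratch array whose length is only known at run time.
        double *data = new double[h];

        // Assumption: a failed device-side allocation yields a null pointer.
        // On the host, plain new throws instead, so this check only matters
        // in device code.
        if (data == nullptr) {
            out[id] = -1.0;  // hypothetical error marker
            return;
        }

        // Placeholder work: fill the scratch array and sum its squares.
        double sum = 0.0;
        for (int i = 0; i < h; ++i) {
            data[i] = static_cast<double>(id + i);
            sum += data[i] * data[i];
        }
        out[id] = sum;

        // Release the scratch array so repeated calls do not leak.
        delete[] data;
    }
};

// Hypothetical host-only driver: calls the functor for ids 0..n-1.
std::vector<double> run_dynamic_on_host(int h, int n) {
    assert(h > 0 && n >= 0);
    std::vector<double> out(n);
    func_dynamic f(h, out.data());
    for (int id = 0; id < n; ++id) {
        f(id);
    }
    return out;
}
```

On a GPU, out would have to point to device memory, and the functor would typically be handed to something like thrust::for_each over a range of ids rather than called in a host loop.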

Alternatively, if the range of values h can take is limited, consider replacing h inside the operator code with a template parameter and using a selector for the known cases, something like

   template<int j>
   __device__ __host__
   void guts(int id) {
       double data[j];
       // code here
   }

   __device__ __host__
   void guts_rt(int id) {
       double *data = new double[h];
       // code here
       delete[] data;
   }

   __device__ __host__
   void operator()(int id) {
       switch (h) {
           case 2:
           guts<2>(id);
           break;

           case 4:
           guts<4>(id);
           break;

           // As many as needed here

           default:
           guts_rt(id);
           break;
      }
  }

That is, use hard-coded array sizes where possible (which the compiler can optimize for), and fall back to the dynamic solution otherwise (provided your GPU actually supports dynamic allocation of heap memory).
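The dispatch pattern above could be fleshed out roughly as follows. This is only a sketch under assumptions: the HD macro, the out pointer, the shared work helper with its placeholder sum, and run_dispatch_on_host are hypothetical names introduced so the example is complete; only guts, guts_rt and the switch come from the answer itself.

```cpp
#include <cassert>
#include <vector>

// Assumption: HD is a convenience macro, not something Thrust or CUDA provides.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

struct func_dispatch {
    const int h;
    double *out;  // hypothetical output buffer, one slot per id

    func_dispatch(int _h, double *_out) : h(_h), out(_out) {}

    // Hypothetical shared body, so the fixed-size and dynamic paths
    // run exactly the same code on their scratch array.
    HD void work(double *data, int n, int id) const {
        double sum = 0.0;
        for (int i = 0; i < n; ++i) {
            data[i] = static_cast<double>(id + i);
            sum += data[i];
        }
        out[id] = sum;
    }

    // Fixed-size path: the array length is a compile-time constant.
    template <int N>
    HD void guts(int id) const {
        double data[N];
        work(data, N, id);
    }

    // Fallback path: length known only at run time, so use the heap.
    HD void guts_rt(int id) const {
        double *data = new double[h];
        work(data, h, id);
        delete[] data;
    }

    // Selector: route the known sizes to the templated version.
    HD void operator()(int id) const {
        switch (h) {
            case 2:  guts<2>(id); break;
            case 4:  guts<4>(id); break;
            // As many cases as needed here
            default: guts_rt(id); break;
        }
    }
};

// Hypothetical host-only driver: calls the functor for ids 0..n-1.
std::vector<double> run_dispatch_on_host(int h, int n) {
    assert(h > 0 && n >= 0);
    std::vector<double> out(n);
    func_dispatch f(h, out.data());
    for (int id = 0; id < n; ++id) {
        f(id);
    }
    return out;
}
```

Keeping the real work in one helper means every extra case in the switch is a single line, and both paths should give the same result for the same h.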

Licensed under: CC-BY-SA with attribution