Question

Thrust automatically selects the GPU backend when I provide an algorithm with iterators from thrust::device_vector, since the vector's data lives on the GPU. However, when I only provide thrust::counting_iterator parameters to an algorithm, how can I select which backend it executes on?

In the following invocation of thrust::find, there are no device_vector iterator arguments, so how does Thrust choose which backend (CPU, OMP, TBB, CUDA) to use?

How can I control on which backend this algorithm executes without using thrust::device_vector<> in this code?

thrust::counting_iterator<uint64_t> first(i);
thrust::counting_iterator<uint64_t> last = first + step_size;

auto iter = thrust::find( 
            thrust::make_transform_iterator(first, functor),
            thrust::make_transform_iterator(last, functor),
            true);

UPDATE 23.01.14 (MSVS2012, CUDA 5.5, Thrust 1.7):

With thrust::device passed as the first argument, the following compiles successfully:

#include <cstdint>
#include <iostream>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/transform_iterator.h>
#include <thrust/find.h>
#include <thrust/functional.h>

#include <thrust/execution_policy.h>

struct is_odd : public thrust::unary_function<uint64_t, bool> {
  // const-qualified so the functor can also be invoked on a const object
  __host__ __device__ bool operator()(uint64_t const& x) const {
    return (x & 1) != 0;
  }
};

int main() {
    thrust::counting_iterator<uint64_t> first(0);
    thrust::counting_iterator<uint64_t> last = first + 100;

    auto iter = thrust::find(thrust::device,
                thrust::make_transform_iterator(first, is_odd()),
                thrust::make_transform_iterator(last, is_odd()),
                true);

    int bbb; std::cin >> bbb;  // blocks until input is entered, so the console window stays open
    return 0;
}

Solution

Where a Thrust algorithm executes can sometimes be ambiguous, as in your counting_iterator example, because the iterator's associated "backend system" is thrust::any_system_tag (a counting_iterator can be dereferenced anywhere because it is not backed by data). In situations like this, Thrust uses the device backend, which is CUDA by default. However, you can explicitly control where execution happens in a couple of ways.

You can either explicitly specify the system through the template parameter as in ngimel's answer, or you can provide the thrust::device execution policy as the first argument to thrust::find in your example:

#include <thrust/execution_policy.h>
...
thrust::counting_iterator<uint64_t> first(i);
thrust::counting_iterator<uint64_t> last = first + step_size;

auto iter = thrust::find(thrust::device,
                         thrust::make_transform_iterator(first, functor),
                         thrust::make_transform_iterator(last, functor),
                         true);

This technique requires Thrust 1.7 or later.
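
The dispatch rule described above can be pictured with a small model in plain C++. This is only a sketch of the idea, not Thrust's implementation: host_tag, device_tag, any_tag, counting_iter, run_on and find_backend are hypothetical names invented for the illustration, and the rule "an unconstrained tag falls back to the device" is taken from the statement above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for Thrust's system tags (not the real types).
struct host_tag {};
struct device_tag {};
struct any_tag {};  // models thrust::any_system_tag: usable on any system

// Toy counting iterator that carries its system as a template parameter.
// With no System given, it is tagged as "any".
template <typename T, typename System = any_tag>
struct counting_iter {
    using system = System;
    T value;
};

// One overload per system; the tag type selects the overload.
inline std::string run_on(host_tag)   { return "host"; }
inline std::string run_on(device_tag) { return "device"; }
// Assumed rule from the answer: an unconstrained tag resolves to the device.
inline std::string run_on(any_tag)    { return run_on(device_tag{}); }

// Without a policy, the system is derived from the iterator type.
template <typename Iterator>
std::string find_backend(Iterator, Iterator) {
    return run_on(typename Iterator::system{});
}

// With a policy as the first argument, the policy decides and the
// iterator's own tag is ignored.
template <typename Policy, typename Iterator>
std::string find_backend(Policy policy, Iterator, Iterator) {
    return run_on(policy);
}

// Small self-check exercising the three cases discussed in the thread.
inline void check_dispatch_model() {
    counting_iter<unsigned long long> first{0}, last{100};
    assert(find_backend(first, last) == "device");              // default
    assert(find_backend(host_tag{}, first, last) == "host");    // explicit policy
    counting_iter<int, host_tag> h{0};
    assert(find_backend(h, h) == "host");                       // tagged iterator
}
```

In this model, passing a tag as the first argument plays the role of thrust::device or thrust::host, while the second template parameter of counting_iter plays the role of the System parameter mentioned in the other answer.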

OTHER TIPS

You have to specify the System template parameter when instantiating counting_iterator:

 typedef thrust::device_system_tag  System;
 thrust::counting_iterator<uint64_t,System> first(i);

If you are using a current version of Thrust, follow the approach Jared Hoberock describes. If you may have to support older versions (the system you work on might have an old version of CUDA installed), the example below might help.

#include <thrust/version.h>

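#if THRUST_MINOR_VERSION > 6
    // Thrust 1.7 and later: execution policies (thrust::host, thrust::device)
    #include <thrust/execution_policy.h>
#elif THRUST_MINOR_VERSION == 6
    // Thrust 1.6: thrust::retag
    #include <thrust/iterator/retag.h>
#else
    // Older versions: no extra header; the tag is passed to counting_iterator directly
#endif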

...

#if THRUST_MINOR_VERSION > 6
  total = 
    thrust::transform_reduce(
      thrust::host
      , thrust::counting_iterator<unsigned int>(0)
      , thrust::counting_iterator<unsigned int>(N)
      , AFunctor(), 0, thrust::plus<unsigned int>());
#elif THRUST_MINOR_VERSION == 6
  total = 
    thrust::transform_reduce(
      thrust::retag<thrust::host_system_tag>(thrust::counting_iterator<unsigned int>(0)) 
      , thrust::retag<thrust::host_system_tag>(thrust::counting_iterator<unsigned int>(N))
      , AFunctor(), 0, thrust::plus<unsigned int>());
#else
  total = 
    thrust::transform_reduce(
      thrust::counting_iterator<unsigned int, thrust::host_space_tag>(0)
      , thrust::counting_iterator<unsigned int, thrust::host_space_tag>(N)
      , AFunctor(), 0, thrust::plus<unsigned int>());
#endif
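
One caveat with the checks above: they compare only the minor version, so a release whose major version is above 1 and whose minor version is 6 or lower would land in the wrong branch. The sketch below shows the same three-way choice made on the combined version number instead. It assumes the encoding documented in thrust/version.h (THRUST_VERSION = major * 100000 + minor * 100 + subminor); encode_version, RetargetMethod and choose_method are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Assumed encoding, as documented in thrust/version.h:
// major * 100000 + minor * 100 + subminor.
constexpr int encode_version(int major, int minor, int subminor = 0) {
    return major * 100000 + minor * 100 + subminor;
}

// The three techniques used in the example above.
enum class RetargetMethod {
    ExecutionPolicy,  // thrust::host / thrust::device (1.7 and later)
    Retag,            // thrust::retag (1.6)
    IteratorTag       // tag passed to counting_iterator (older versions)
};

// Chooses the technique from the combined version number rather than
// from the minor version alone.
constexpr RetargetMethod choose_method(int thrust_version) {
    return thrust_version >= encode_version(1, 7) ? RetargetMethod::ExecutionPolicy
         : thrust_version >= encode_version(1, 6) ? RetargetMethod::Retag
         : RetargetMethod::IteratorTag;
}

// Small self-check covering one version per branch plus a 2.x release.
inline void check_version_model() {
    assert(choose_method(encode_version(1, 7, 0)) == RetargetMethod::ExecutionPolicy);
    assert(choose_method(encode_version(1, 6, 0)) == RetargetMethod::Retag);
    assert(choose_method(encode_version(1, 5, 3)) == RetargetMethod::IteratorTag);
    assert(choose_method(encode_version(2, 2, 0)) == RetargetMethod::ExecutionPolicy);
}
```

In preprocessor form the corresponding test would be #if THRUST_VERSION >= 100700, followed by #elif THRUST_VERSION >= 100600.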

@see Thrust: How to directly control where an algorithm invocation executes?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow