OpenMP to CUDA: Reduction

https://stackoverflow.com/questions/13793097

for-loop
cuda
openmp
reduction

06-12-2021

Question

I'm trying to figure out how to express the equivalent of OpenMP's for reduction() in CUDA. I've done some research online, but nothing I've tried has worked. The code:

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < N; i++)
    {
        float f = ...  //store return from function to f
        out[i] = f;    //store f to out[i]
        sum += f;      //add f to sum and store in sum
    }

I know what the reduction() clause does in OpenMP: it makes the last line of the for loop safe to run in parallel. But how can I express the same thing in CUDA?

Thanks!

Solution

Use Thrust, an STL-inspired library that ships with CUDA. See the Thrust Quick Start Guide for examples of how to perform reductions.
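As a rough sketch of how the loop from the question might look with Thrust: one thrust::transform call fills out, and one thrust::reduce call sums it. The compute_f functor and the fill_and_sum helper are hypothetical names made up for illustration, and the body of compute_f is only a placeholder, since the question does not show the real function.

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/reduce.h>
#include <thrust/functional.h>
#include <thrust/iterator/counting_iterator.h>
#include <cstdio>

// Hypothetical stand-in for the elided per-element function in the question.
struct compute_f
{
    __host__ __device__
    float operator()(int i) const
    {
        return 0.5f * static_cast<float>(i);  // placeholder computation
    }
};

// Hypothetical helper: writes out[i] = compute_f(i) on the device,
// then returns the sum of all elements of out.
float fill_and_sum(thrust::device_vector<float>& out)
{
    const int n = static_cast<int>(out.size());
    thrust::counting_iterator<int> first(0);

    // Equivalent of "out[i] = f;" for every i in [0, n).
    thrust::transform(first, first + n, out.begin(), compute_f());

    // Equivalent of "sum += f;" with reduction(+:sum).
    return thrust::reduce(out.begin(), out.end(), 0.0f, thrust::plus<float>());
}

int main()
{
    const int N = 1024;
    thrust::device_vector<float> out(N);

    float sum = fill_and_sum(out);
    std::printf("sum = %f\n", sum);
    return 0;
}
```

This should compile with nvcc (for example, nvcc example.cu, where the file name is arbitrary). If the out array were not needed, thrust::transform_reduce could combine both steps into a single call.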

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow