How one would implement a 2-for particle-interaction loop using CUDA, and what is the resulting complexity?

StackOverflow https://stackoverflow.com/questions/22394466

Question

This algorithm receives a world (list) of particles (3-dimensional vectors) and calls an interacting function between them. Or, in pseudocode:

function tick(world)
     for i in range(world)
        for j in range(world)
            world[i] = interact(world[i], world[j])

Where interact is a function that takes 2 particles and return another one, and could be anything, for example:

 function interact(a,b) = (a + b)*0.5

You can easily determine this algorithm is O(N^2) on the CPU. In my attempt to learn CUDA, I'm not sure how that could be implemented on the GPU. What would be the general structure of such algorithm, and what would be the resulting complexity? What if we knew the interact function didn't do anything if 2 particles were distant enough? Could we optimize it for locality?

Was it helpful?

Solution

What would be the general structure of such algorithm, and what would be the resulting complexity?

This is essentially the n-body problem. Solved using a direct particle-particle approach. It's been written about a lot. The order of the algorithm is O(N^2) on the GPU, just as it is on the CPU.

The core algorithm as implemented in CUDA doesn't change a lot except to take advantage of local block memory and optimize for it. Essentially the implementation would still come does to two loops.

The following paper is a good place to start, Chapter 31. Fast N-Body Simulation with CUDA.

Could we optimize it for locality?

Yes. Many n-body algorithms attempt to optimize for locality as gravitational and E-M forces decrease as a power of the distance between particles so distant particles can be ignored or their contribution can be approximated. Which of these approximation approaches to take largely depends on the type of system you are trying to simulate.

The following is a good overview of some of the more popular approaches, Seminar presentation, N-body algorithms

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top