Solving the Poisson equation on multiple GPUs located on different cluster nodes interacting by the MPI protocol

StackOverflow https://stackoverflow.com/questions/18896314

  •  29-06-2022

Question

I'm trying to solve the Poisson equation in real space on a multi-GPU architecture, using C/CUDA code with the MPI library. For the moment, I'm only interested in solving the problem in a periodic box, but in the future I may want to look at spherical geometry.

Is there an existing routine to solve this problem? Comments dating from August 2012 seem to indicate that the Thrust library is not adapted to multi-GPU architectures. Is that still correct?

If such a routine exists, what method does it use (Jacobi, SOR, Gauss-Seidel, Krylov)? Please share your opinion about its speed and any problems you may have encountered.

Thanks for your time.


Solution

Solving the Poisson equation with a multi-GPU approach, with the GPUs located on different cluster nodes and communicating through MPI, is a relatively recent research topic. The basic idea is domain decomposition: each GPU solves for one part of the computational domain, and MPI is used to exchange boundary (halo) data between neighbouring subdomains.
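To make the decomposition idea concrete, here is a minimal single-process sketch in plain C for the 1-D periodic problem u'' = f. It is only an illustration under stated assumptions, not code from any of the packages mentioned in this thread: the names Subdomain, exchange_halos, jacobi_sweep and solve_periodic_poisson are hypothetical, the "exchange" is a plain copy between arrays where a real code would use MPI (for example MPI_Sendrecv), and the sweep is a CPU loop where a real code would launch a CUDA kernel on each GPU. Jacobi is used only because it is the simplest of the methods listed in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One subdomain: n_local interior cells plus one ghost cell on each side.
   In a real code there would be one of these per MPI rank / GPU. */
typedef struct {
    int n_local;
    double *u;      /* size n_local + 2; u[0] and u[n_local + 1] are ghosts */
    double *u_new;  /* same layout, scratch buffer for the Jacobi update */
    double *f;      /* size n_local + 2; only the interior entries are used */
} Subdomain;

/* Stand-in for the MPI halo exchange on a periodic ring of subdomains:
   reads interior cells of the neighbours, writes only ghost cells. */
static void exchange_halos(Subdomain *d, int nparts)
{
    for (int p = 0; p < nparts; ++p) {
        int left  = (p + nparts - 1) % nparts;
        int right = (p + 1) % nparts;
        d[p].u[0] = d[left].u[d[left].n_local];
        d[p].u[d[p].n_local + 1] = d[right].u[1];
    }
}

/* Stand-in for the per-GPU kernel: one Jacobi sweep for u'' = f. */
static void jacobi_sweep(Subdomain *s, double h)
{
    for (int i = 1; i <= s->n_local; ++i)
        s->u_new[i] = 0.5 * (s->u[i - 1] + s->u[i + 1] - h * h * s->f[i]);
    double *tmp = s->u;  /* swap buffers; ghosts are refreshed before reuse */
    s->u = s->u_new;
    s->u_new = tmp;
}

/* Runs `sweeps` Jacobi iterations on a periodic grid of n cells with
   spacing h, split into nparts equal subdomains, starting from u = 0.
   f and u_out are global arrays of length n. On a periodic box f should
   have zero mean, otherwise the problem has no solution. */
void solve_periodic_poisson(const double *f, double *u_out,
                            int n, int nparts, double h, int sweeps)
{
    assert(nparts > 0 && n % nparts == 0);
    int nl = n / nparts;
    Subdomain *d = malloc((size_t)nparts * sizeof *d);

    for (int p = 0; p < nparts; ++p) {
        d[p].n_local = nl;
        d[p].u     = calloc((size_t)nl + 2, sizeof(double));
        d[p].u_new = calloc((size_t)nl + 2, sizeof(double));
        d[p].f     = calloc((size_t)nl + 2, sizeof(double));
        memcpy(d[p].f + 1, f + p * nl, (size_t)nl * sizeof(double));
    }

    for (int k = 0; k < sweeps; ++k) {
        exchange_halos(d, nparts);          /* communication step */
        for (int p = 0; p < nparts; ++p)    /* computation step   */
            jacobi_sweep(&d[p], h);
    }

    for (int p = 0; p < nparts; ++p) {
        memcpy(u_out + p * nl, d[p].u + 1, (size_t)nl * sizeof(double));
        free(d[p].u);
        free(d[p].u_new);
        free(d[p].f);
    }
    free(d);
}
```

Calling solve_periodic_poisson with nparts = 1 and with nparts = 4 on the same right-hand side should give the same result after the same number of sweeps, which is a convenient check that the halo exchange is wired correctly before moving to real MPI and CUDA.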

You may wish to have a look at the papers "Towards a multi-GPU solver for the three-dimensional two-phase incompressible Navier-Stokes equations", presented at GTC 2012, and "An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters". In the first paper in particular, the Navier-Stokes equations are solved by Chorin's projection approach, which in turn requires the solution of a Poisson equation. That solve is the most demanding task and is handled by a multi-GPU/MPI strategy built on a Jacobi-preconditioned conjugate gradient solver.
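For readers unfamiliar with the method, the following is a minimal serial sketch in C of a Jacobi-preconditioned conjugate gradient iteration. It is not the solver from either paper. As an assumption made purely for illustration, it uses a 1-D variable-coefficient operator with zero Dirichlet boundaries, so that the matrix is symmetric positive definite and its diagonal actually varies; the names apply_A, dot and pcg_jacobi are hypothetical. In a multi-GPU/MPI version, apply_A would need a halo exchange and each dot product would need a global reduction (for example MPI_Allreduce); here everything is local.

```c
#include <assert.h>
#include <stdlib.h>

/* y = A x for a 1-D operator with positive face coefficients c[0..n]
   and zero Dirichlet values outside the grid:
   (A x)_i = (c[i] + c[i+1]) x_i - c[i] x_{i-1} - c[i+1] x_{i+1}. */
static void apply_A(const double *c, const double *x, double *y, int n)
{
    for (int i = 0; i < n; ++i) {
        double xl = (i > 0)     ? x[i - 1] : 0.0;
        double xr = (i < n - 1) ? x[i + 1] : 0.0;
        y[i] = (c[i] + c[i + 1]) * x[i] - c[i] * xl - c[i + 1] * xr;
    }
}

static double dot(const double *a, const double *b, int n)
{
    double s = 0.0;
    for (int i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

/* Solves A x = b with conjugate gradients, preconditioned by the
   diagonal of A (Jacobi). x holds the initial guess on entry and the
   approximate solution on exit. Stops when ||r|| <= tol * ||b|| or
   after max_iter iterations; returns the number of iterations done. */
int pcg_jacobi(const double *c, const double *b, double *x,
               int n, double tol, int max_iter)
{
    assert(n > 0);
    double *r = malloc(4 * (size_t)n * sizeof *r);
    double *z = r + n, *p = z + n, *q = p + n;

    apply_A(c, x, q, n);
    for (int i = 0; i < n; ++i) {
        r[i] = b[i] - q[i];               /* initial residual       */
        z[i] = r[i] / (c[i] + c[i + 1]);  /* z = D^{-1} r (Jacobi)  */
        p[i] = z[i];
    }
    double rz = dot(r, z, n);
    double stop = tol * tol * dot(b, b, n);
    int it = 0;

    while (it < max_iter && dot(r, r, n) > stop) {
        apply_A(c, p, q, n);
        double alpha = rz / dot(p, q, n);
        for (int i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * q[i];
            z[i] = r[i] / (c[i] + c[i + 1]);
        }
        double rz_new = dot(r, z, n);
        double beta = rz_new / rz;
        for (int i = 0; i < n; ++i)
            p[i] = z[i] + beta * p[i];
        rz = rz_new;
        ++it;
    }
    free(r);
    return it;
}
```

The preconditioner costs only one division per unknown and needs no communication, which is part of its appeal in a distributed setting; the price is that it does little to reduce the iteration count when the diagonal is nearly constant.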

Concerning available routines, in the past I have come across GAMER, a downloadable code for astrophysics applications. The authors claim that it contains a variety of GPU-accelerated Poisson solvers and hybrid OpenMP/MPI/GPU parallelization. However, I have never had the chance to download it.

OTHER TIPS

Thrust can be used in a multi-GPU environment. You can use the CUDA runtime API, i.e. cudaSetDevice, to switch devices. Since Thrust handles allocations and deallocations for vectors implicitly, care must be taken to make sure that the correct device is selected both when device vectors are declared and when they are deallocated, i.e. when they go out of scope.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow