Gadget3 on GPUs with OpenACC

03/24/2020
by Antonio Ragagnin, et al.

We present preliminary results of a GPU port of all main Gadget3 modules (gravity computation, SPH density computation, SPH hydrodynamic force, and thermal conduction) using OpenACC directives. We assign one GPU to each MPI rank and exploit both host and accelerator capabilities by overlapping computations on the CPUs and GPUs: while the GPUs asynchronously compute interactions between particles within their MPI ranks, the CPUs perform tree walks and MPI communication of neighbouring particles. We profile various portions of the code to understand the origin of our speedup and find that peak speedup is not achieved because of time steps with few active particles. We run a hydrodynamic cosmological simulation from the Magneticum project with 2·10^7 particles and find a final total speedup of ≈ 2. We also present the results of an encouraging scaling test of a preliminary gravity-only OpenACC port, run in the context of the EuroHack17 event, in which the prototype maintained a constant speedup up to 1024 GPUs.
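The CPU/GPU overlap described in the abstract can be illustrated with a minimal OpenACC sketch in C. This is not Gadget3 code: the function names (`local_interactions_async`, `count_export_candidates`, `overlap_step`), the toy one-dimensional "interaction", and the `boundary` parameter are all assumptions made for illustration. The only OpenACC features used are the standard `async` clause and `wait` directive. A compiler without OpenACC support ignores the pragmas and runs the same code serially.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a local particle-particle kernel (hypothetical, 1D):
 * acc[i] = sum_j mass[j] * (pos[j] - pos[i]).
 * The loop is queued on async queue 1, so with an OpenACC compiler the
 * host returns immediately while the accelerator works. */
static void local_interactions_async(const float *pos, const float *mass,
                                     float *acc, int n)
{
    #pragma acc parallel loop async(1) copyin(pos[0:n], mass[0:n]) copyout(acc[0:n])
    for (int i = 0; i < n; i++) {
        float a = 0.0f;
        #pragma acc loop reduction(+:a)
        for (int j = 0; j < n; j++)
            a += mass[j] * (pos[j] - pos[i]);
        acc[i] = a;
    }
}

/* Toy stand-in for the CPU-side work (tree walk, MPI exchange):
 * counts particles beyond a hypothetical domain boundary, i.e. the
 * ones that would have to be sent to a neighbouring rank. */
static int count_export_candidates(const float *pos, int n, float boundary)
{
    int exported = 0;
    for (int i = 0; i < n; i++)
        if (pos[i] > boundary)
            exported++;
    return exported;
}

/* One overlapped step: launch the accelerator work, do the host work
 * in the meantime, then synchronise before acc[] is read. */
static int overlap_step(const float *pos, const float *mass, float *acc,
                        int n, float boundary)
{
    assert(n > 0);
    local_interactions_async(pos, mass, acc, n);            /* GPU, non-blocking */
    int exported = count_export_candidates(pos, n, boundary); /* CPU, concurrent */
    #pragma acc wait(1)                                      /* acc[] valid after this */
    return exported;
}
```

Calling `overlap_step` once per time step gives the pattern the abstract describes: the host only blocks at the `wait`, so any CPU work that finishes earlier is hidden behind the accelerator kernel. When few particles are active, the kernel is short and there is little to hide, which is consistent with the reduced speedup reported for such time steps.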

research
11/06/2022

Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI

We assess the performance of the hybrid Open Accelerator (OpenACC) and M...
research
10/23/2018

Exploiting the Space Filling Curve Ordering of Particles in the Neighbour Search of Gadget3

Gadget3 is nowadays one of the most frequently used high performing para...
research
12/13/2020

A GPU-Accelerated Fast Summation Method Based on Barycentric Lagrange Interpolation and Dual Tree Traversal

We present the barycentric Lagrange dual tree traversal (BLDTT) fast sum...
research
11/01/2021

Principles towards Real-Time Simulation of Material Point Method on Modern GPUs

Physics-based simulation has been actively employed in generating offlin...
research
03/03/2020

A GPU-Accelerated Barycentric Lagrange Treecode

We present an MPI + OpenACC implementation of the kernel-independent bar...
research
12/05/2020

An Improved Framework of GPU Computing for CFD Applications on Structured Grids using OpenACC

This paper is focused on improving multi-GPU performance of a research C...
research
08/01/2023

The MPI + CUDA Gaia AVU-GSR Parallel Solver Toward Next-generation Exascale Infrastructures

We ported to the GPU with CUDA the Astrometric Verification Unit-Global ...
