SIMD Vectorization for the Lennard-Jones Potential with AVX2 and AVX-512 instructions

06/13/2018
by Hiroshi Watanabe, et al.

This work describes the SIMD vectorization of the force calculation for the Lennard-Jones potential with the Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method involves indirect access to memory, the data layout is one of the most important factors in vectorization. We find that, with appropriate vectorization and optimizations, the Array of Structures (AoS) with padding exhibits better performance than the Structure of Arrays (SoA). In particular, AoS with 512-bit width exhibits the best performance among the architectures. While the difference in performance between AoS and SoA is significant for vectorization with AVX2, it is minor with AVX-512. The effect of other optimization techniques, such as software pipelining together with vectorization, is also discussed. We present benchmark results on three CPU architectures: Intel Haswell (HSW), Knights Landing (KNL), and Skylake (SKL). On HSW, vectorization yields a performance gain of about 42% over the code optimized without vectorization. On KNL, the hand-vectorized codes exhibit 34% better performance than the codes vectorized automatically by the Intel compiler. On SKL, the code vectorized with AVX2 exhibits slightly better performance than the code vectorized with AVX-512.
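
To make the two data layouts concrete, the following is a minimal scalar sketch of a pair-list force kernel written for both of them. It is not the authors' code. The names `Vec4`, `SoA`, `PairList`, `lj_coeff`, `force_aos`, and `force_soa` are hypothetical. The sketch assumes reduced units (epsilon = sigma = 1), omits any cutoff, and uses plain C++ rather than AVX2 or AVX-512 intrinsics.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// AoS with padding (hypothetical type): x, y, z plus one unused slot, so
// each atom occupies 4 doubles = 256 bits, the width of one AVX2 register.
struct Vec4 {
  double x, y, z, pad;
};
static_assert(sizeof(Vec4) == 32, "one atom should span exactly 256 bits");

// SoA (hypothetical type): one contiguous array per coordinate.
struct SoA {
  std::vector<double> x, y, z;
};

// Interacting pairs (i, j); the indices are the source of indirect access.
typedef std::vector<std::pair<int, int> > PairList;

// Scalar coefficient of the LJ force, assuming epsilon = sigma = 1:
// F_i = lj_coeff(r2) * (q_j - q_i), with r2 = |q_j - q_i|^2.
inline double lj_coeff(double r2) {
  assert(r2 > 0.0);
  const double r6 = r2 * r2 * r2;
  return (24.0 * r6 - 48.0) / (r6 * r6 * r2);
}

// AoS kernel: all coordinates of atom j sit next to each other in memory.
void force_aos(const std::vector<Vec4>& q, std::vector<Vec4>& f,
               const PairList& pairs) {
  for (std::size_t k = 0; k < pairs.size(); ++k) {
    const int i = pairs[k].first;
    const int j = pairs[k].second;
    const double dx = q[j].x - q[i].x;
    const double dy = q[j].y - q[i].y;
    const double dz = q[j].z - q[i].z;
    const double c = lj_coeff(dx * dx + dy * dy + dz * dz);
    f[i].x += c * dx; f[i].y += c * dy; f[i].z += c * dz;
    f[j].x -= c * dx; f[j].y -= c * dy; f[j].z -= c * dz;  // Newton's third law
  }
}

// SoA kernel: atom j's coordinates live in three separate arrays.
void force_soa(const SoA& q, SoA& f, const PairList& pairs) {
  for (std::size_t k = 0; k < pairs.size(); ++k) {
    const int i = pairs[k].first;
    const int j = pairs[k].second;
    const double dx = q.x[j] - q.x[i];
    const double dy = q.y[j] - q.y[i];
    const double dz = q.z[j] - q.z[i];
    const double c = lj_coeff(dx * dx + dy * dy + dz * dz);
    f.x[i] += c * dx; f.y[i] += c * dy; f.z[i] += c * dz;
    f.x[j] -= c * dx; f.y[j] -= c * dy; f.z[j] -= c * dz;  // Newton's third law
  }
}
```

In the padded AoS form, the coordinates of an indirectly indexed atom are contiguous, so a vectorized version could in principle fetch them with a single 256-bit load. In the SoA form, the same atom's coordinates come from three separate arrays, which a vectorized version would typically have to collect with gather-style accesses. This is only an illustration of why the layout matters for indirect access, not a reproduction of the benchmarked kernels.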


