HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

10/03/2022
by Matthieu Dorier, et al.

Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specialized interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for users to find the best configuration for a given workload and platform. To address this issue, we develop a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters. Our approach uses transfer learning to leverage prior tuning results and uses a dynamically updated surrogate model to explore the large parameter search space systematically. We implement our approach within the DeepHyper open-source framework and apply it to the autotuning of a high-energy physics workflow on Argonne's Theta supercomputer. We show that our transfer-learning approach yields a search speedup of more than 40× over random search, compared with a 2.5× to 10× speedup without transfer learning. Additionally, we show that our approach is on par with state-of-the-art autotuning frameworks in speed and outperforms them in resource utilization and parallelization capabilities.
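The idea of reusing prior tuning results can be illustrated with a small, self-contained sketch. This is not the paper's method and does not use DeepHyper: a per-parameter Gaussian fitted to the best configurations of an earlier run stands in for the variational autoencoder, and there is no surrogate model or asynchronous evaluation. The parameter names, bounds, and the toy `runtime` objective are all hypothetical, chosen only to make the example runnable.

```python
import random
import statistics

# Hypothetical search space (assumption): parameter name -> (low, high) bounds.
SPACE = {"num_servers": (1.0, 32.0), "batch_size": (1.0, 1024.0)}


def runtime(config, best_servers):
    """Toy convex objective standing in for a measured run time (lower is better)."""
    servers_term = (config["num_servers"] - best_servers) ** 2
    batch_term = ((config["batch_size"] - 512.0) / 32.0) ** 2
    return servers_term + batch_term


def sample_uniform(rng):
    """Random search: draw every parameter uniformly within its bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in SPACE.items()}


def fit_prior(history, top_fraction=0.1):
    """Fit a per-parameter Gaussian to the best configurations of an earlier run.

    This simple density model is a stand-in for the VAE named in the abstract.
    """
    ranked = sorted(history, key=lambda item: item[1])
    top = [cfg for cfg, _ in ranked[: max(2, int(len(ranked) * top_fraction))]]
    prior = {}
    for name in SPACE:
        values = [cfg[name] for cfg in top]
        prior[name] = (statistics.mean(values), statistics.stdev(values))
    return prior


def sample_prior(prior, rng):
    """Draw a configuration from the fitted prior, clipped to the bounds."""
    config = {}
    for name, (lo, hi) in SPACE.items():
        mu, sigma = prior[name]
        config[name] = min(hi, max(lo, rng.gauss(mu, sigma)))
    return config


rng = random.Random(0)

# Earlier, related tuning run (its optimum is assumed to lie near the new one).
history = []
for _ in range(200):
    cfg = sample_uniform(rng)
    history.append((cfg, runtime(cfg, best_servers=18.0)))
prior = fit_prior(history)

# New task: compare 20 uniform samples with 20 samples drawn from the prior.
best_random = min(runtime(sample_uniform(rng), 20.0) for _ in range(20))
best_guided = min(runtime(sample_prior(prior, rng), 20.0) for _ in range(20))
print(f"best of 20 random samples:       {best_random:.2f}")
print(f"best of 20 prior-guided samples: {best_guided:.2f}")
```

Because the prior concentrates samples in the region that performed well previously, a guided search tends to reach good configurations in fewer evaluations when the two tasks are similar. If the earlier run is unrepresentative of the new workload, the same prior can mislead the search.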


