YAHPO Gym – Design Criteria and a new Multifidelity Benchmark for Hyperparameter Optimization

by Florian Pfisterer, et al.
Universität München

When developing and analyzing new hyperparameter optimization (HPO) methods, it is vital to empirically evaluate and compare them on well-curated benchmark suites. In this work, we list desirable properties and requirements for such benchmarks and propose a new set of challenging and relevant multifidelity HPO benchmark problems motivated by these requirements. To this end, we revisit the concept of surrogate-based benchmarks and empirically compare them to the more widely used tabular benchmarks, showing that the latter may induce bias in performance estimation and ranking of HPO methods. We present a new surrogate-based benchmark suite for multifidelity HPO methods consisting of 9 benchmark collections that together constitute over 700 multifidelity HPO problems. All our benchmarks also allow querying multiple optimization targets, enabling the benchmarking of multi-objective HPO. We examine our benchmark suite with respect to the defined requirements, compare it with existing suites, and show that our benchmarks are viable additions to them.




Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates

The optimization of algorithm (hyper-)parameters is crucial for achievin...

Why every GBDT speed benchmark is wrong

This article provides a comprehensive study of different ways to make sp...

Ten New Benchmarks for Optimization

Benchmarks are used for testing new optimization algorithms and their va...

FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization

Hyperparameter optimization (HPO) is crucial for machine learning algori...

HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO

To achieve peak predictive performance, hyperparameter optimization (HPO...

Towards Realistic Optimization Benchmarks: A Questionnaire on the Properties of Real-World Problems

Benchmarks are a useful tool for empirical performance comparisons. Howe...
