Consolidated learning – a domain-specific model-free optimization strategy with examples for XGBoost and MIMIC-IV

by Katarzyna Woznica, et al.
Poznan University of Technology
Politechnika Warszawska

For many machine learning models, the choice of hyperparameters is a crucial step towards achieving high performance. Prevalent meta-learning approaches focus on obtaining good hyperparameter configurations for a completely new task under a limited computational budget, based on results obtained from prior tasks. This paper proposes a new formulation of the tuning problem, called consolidated learning, that is better suited to the practical challenges faced by model developers, who often create a large number of predictive models on similar data sets. In such settings, we are interested in the total optimization time across tasks rather than in tuning for a single task. We show that a carefully selected static portfolio of hyperparameters yields good results for anytime optimization while remaining easy to use and implement. Moreover, we point out how to construct such a portfolio for specific domains. The improvement in optimization is possible due to a more efficient transfer of hyperparameter configurations between similar tasks. We demonstrate the effectiveness of this approach through an empirical study of the XGBoost algorithm on a collection of predictive tasks extracted from the MIMIC-IV medical database; however, consolidated learning is applicable in many other fields.
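To make the idea of a static portfolio used for anytime optimization more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a portfolio is simply an ordered list of configurations that are evaluated one by one, with the best result so far available at every step. The function `run_portfolio`, the three configurations in `PORTFOLIO`, and the scoring function `toy_score` are all hypothetical; in practice `toy_score` would be replaced by something like a cross-validated score of XGBoost on the new task.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def run_portfolio(
    portfolio: List[Config],
    evaluate: Callable[[Config], float],
    budget: int,
) -> List[Tuple[Config, float]]:
    """Evaluate configurations in their fixed order, up to `budget` of them.

    Returns the best (config, score) seen after each evaluation, so the
    search can be stopped at any point and still report a result.
    """
    trace: List[Tuple[Config, float]] = []
    best_cfg, best_score = None, float("-inf")
    for cfg in portfolio[:budget]:
        score = evaluate(cfg)  # higher is better
        if score > best_score:
            best_cfg, best_score = cfg, score
        trace.append((best_cfg, best_score))
    return trace


# Hypothetical portfolio: illustrative values only, not taken from the paper.
PORTFOLIO: List[Config] = [
    {"max_depth": 6, "learning_rate": 0.1, "n_estimators": 200},
    {"max_depth": 3, "learning_rate": 0.3, "n_estimators": 100},
    {"max_depth": 10, "learning_rate": 0.03, "n_estimators": 500},
]


def toy_score(cfg: Config) -> float:
    """Stand-in for a real validation score (assumption for this sketch)."""
    return 1.0 - abs(cfg["max_depth"] - 4) / 10 - abs(cfg["learning_rate"] - 0.2)


trace = run_portfolio(PORTFOLIO, toy_score, budget=3)
best_config, best_score = trace[-1]
```

Because the portfolio is fixed in advance, no surrogate model has to be fitted per task; under this sketch, the only per-task cost is evaluating the listed configurations.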




Towards Assessing the Impact of Bayesian Optimization's Own Hyperparameters

Bayesian Optimization (BO) is a common approach for hyperparameter optim...

An empirical study on hyperparameter tuning of decision trees

Machine learning algorithms often contain many hyperparameters whose val...

Experimental Investigation and Evaluation of Model-based Hyperparameter Optimization

Machine learning algorithms such as random forests or xgboost are gainin...

Hyperparameter Importance Across Datasets

With the advent of automated machine learning, automated hyperparameter ...

Quick-Tune: Quickly Learning Which Pretrained Model to Finetune and How

With the ever-increasing number of pretrained models, machine learning p...

Mimic-IV-ICD: A new benchmark for eXtreme MultiLabel Classification

Clinical notes are assigned ICD codes - sets of codes for diagnoses and ...
