Adaptive Calibrator Ensemble for Model Calibration under Distribution Shift

by Yuli Zou, et al.

Model calibration usually requires optimizing some parameters (e.g., temperature) with respect to an objective function (e.g., negative log-likelihood). In this paper, we report a plain, important, but often neglected fact: the objective function is influenced by calibration set difficulty, i.e., the ratio of the number of incorrectly classified samples to that of correctly classified samples. If a test set has a drastically different difficulty level from the calibration set, the optimal calibration parameters of the two datasets will differ. In other words, a calibrator that is optimal on the calibration set will be suboptimal on the OOD test set, and its performance will degrade. With this knowledge, we propose a simple and effective method named adaptive calibrator ensemble (ACE) to calibrate OOD datasets, whose difficulty is usually higher than that of the calibration set. Specifically, two calibration functions are trained, one for in-distribution data (low difficulty) and the other for severely OOD data (high difficulty). To achieve desirable calibration on a new OOD dataset, ACE uses an adaptive weighting method that strikes a balance between the two extreme functions. When plugged in, ACE generally improves the performance of a few state-of-the-art calibration schemes on a series of OOD benchmarks. Importantly, this improvement does not come at the cost of in-distribution calibration accuracy.
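
The dependence of the optimal temperature on set difficulty can be illustrated with a small sketch. It is not the authors' implementation: the toy two-class data, the grid search, and the helper names (`nll`, `fit_temperature`, `make_set`, `blend_temperature`) are assumptions made for illustration. The blending weight in particular is a hypothetical input supplied by hand, since the abstract does not say how ACE computes its adaptive weights.

```python
import numpy as np

def nll(logits, labels, temperature):
    """Mean negative log-likelihood of temperature-scaled logits."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # stabilise the softmax
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a grid that minimises NLL (illustrative fit)."""
    if grid is None:
        grid = np.linspace(0.05, 10.0, 200)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def make_set(n_correct, n_wrong):
    """Toy data: every sample has logits [2, 0], so class 0 is always predicted.
    Label 0 marks a correct prediction, label 1 an incorrect one."""
    n = n_correct + n_wrong
    logits = np.tile([2.0, 0.0], (n, 1))
    labels = np.array([0] * n_correct + [1] * n_wrong)
    return logits, labels

def blend_temperature(t_easy, t_hard, weight):
    """Hypothetical blend: convex combination of the two extreme temperatures.
    `weight` is an assumed input in [0, 1], not ACE's actual weighting rule."""
    w = float(np.clip(weight, 0.0, 1.0))
    return (1.0 - w) * t_easy + w * t_hard

# Difficulty = incorrect / correct: 10/90 for the easy set, 40/60 for the hard set.
t_easy = fit_temperature(*make_set(90, 10))
t_hard = fit_temperature(*make_set(60, 40))
t_mid = blend_temperature(t_easy, t_hard, 0.5)
print(t_easy, t_hard, t_mid)
```

On this toy data the NLL-optimal temperature has the closed form T = 2 / ln(p / (1 - p)), where p is the fraction of correct predictions, so the harder set calls for a noticeably larger temperature than the easy one. A temperature fitted on the easy set would therefore leave the harder set overconfident, which is the mismatch the abstract describes.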



Dual-Branch Temperature Scaling Calibration for Long-Tailed Recognition

The calibration for deep neural networks is currently receiving widespre...

PEP: Parameter Ensembling by Perturbation

Ensembling is now recognized as an effective approach for increasing the...

One Eye is All You Need: Lightweight Ensembles for Gaze Estimation with Single Encoders

Gaze estimation has grown rapidly in accuracy in recent years. However, ...

eSSVI Surface Calibration

In this work I test two calibration algorithms for the eSSVI volatility ...

Sample-dependent Adaptive Temperature Scaling for Improved Calibration

It is now well known that neural networks can be wrong with high confide...

A comparison of linear and non-linear calibrations for speaker recognition

In recent work on both generative and discriminative score to log-likeli...

Joint calibration of Ensemble of Exemplar SVMs

We present a method for calibrating the Ensemble of Exemplar SVMs model....
