Sparse MoEs meet Efficient Ensembles

10/07/2021
by James Urquhart Allingham, et al.

Machine learning models that aggregate the outputs of submodels, either at the activation or the prediction level, achieve strong performance. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixtures of experts (sparse MoEs). First, we show that these two approaches have complementary features whose combination is beneficial. Then, we present partitioned batch ensembles, an efficient ensemble of sparse MoEs that takes the best of both classes of models. Extensive experiments on fine-tuned vision transformers demonstrate the accuracy, log-likelihood, few-shot learning, robustness, and uncertainty calibration improvements of our approach over several challenging baselines. Partitioned batch ensembles not only scale to models with up to 2.7B parameters, but also provide larger performance gains for larger models.
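To make the partitioning idea concrete, here is a toy NumPy sketch, not the paper's actual implementation: all names, sizes, and the routing details are illustrative assumptions. Each ensemble member performs top-k routing only within its own disjoint subset of experts, so a single forward pass yields one prediction per member, which is the structural ingredient that lets an ensemble of sparse MoEs share one set of parameters efficiently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical, chosen only for illustration).
d, n_experts, k, n_members = 8, 4, 1, 2

# Partition the experts among ensemble members: each member routes
# only within its own subset of experts.
partitions = np.split(np.arange(n_experts), n_members)  # e.g. [[0, 1], [2, 3]]

W_router = rng.normal(size=(d, n_experts))    # shared router producing expert logits
W_experts = rng.normal(size=(n_experts, d, d))  # one weight matrix per expert

def partitioned_moe(x):
    """Top-k routing restricted to each member's expert partition."""
    logits = x @ W_router  # (n_experts,) routing scores for all experts
    outputs = []
    for experts in partitions:
        member_logits = logits[experts]
        # Member-local top-k: pick the best experts within this partition only.
        top = experts[np.argsort(member_logits)[-k:]]
        # Softmax gates over this member's experts.
        gates = np.exp(member_logits - member_logits.max())
        gates /= gates.sum()
        y = sum(
            gates[np.where(experts == e)[0][0]] * (x @ W_experts[e])
            for e in top
        )
        outputs.append(y)
    # One prediction per ensemble member, from a single forward pass.
    return np.stack(outputs)

ys = partitioned_moe(rng.normal(size=d))
print(ys.shape)  # (n_members, d)
```

The member predictions in `ys` can then be averaged at the prediction level, as in a conventional deep ensemble, while the router and expert parameters are shared across members.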

Related research:

- 06/24/2020, Hyperparameter Ensembles for Robustness and Uncertainty Quantification: Ensembles over neural network weights trained from different random init...
- 03/06/2023, To Stay or Not to Stay in the Pre-train Basin: Insights on Ensembling in Transfer Learning: Transfer learning and ensembling are two popular techniques for improvin...
- 07/09/2021, Multi-headed Neural Ensemble Search: Ensembles of CNN models trained with different seeds (also known as Deep...
- 04/25/2023, Certifying Ensembles: A General Certification Theory with S-Lipschitzness: Improving and guaranteeing the robustness of deep learning models has be...
- 06/05/2018, Combining Multiple Algorithms in Classifier Ensembles using Generalized Mixture Functions: Classifier ensembles are pattern recognition structures composed of a se...
- 09/09/2022, Automatic Readability Assessment of German Sentences with Transformer Ensembles: Reliable methods for automatic readability assessment have the potential...
- 03/15/2023, Bayesian Quadrature for Neural Ensemble Search: Ensembling can improve the performance of Neural Networks, but existing ...
