On the Reproducibility of Neural Network Predictions

02/05/2021
by Srinadh Bhojanapalli, et al.

Standard training techniques for neural networks involve multiple sources of randomness, e.g., initialization, mini-batch ordering, and in some cases data augmentation. Given that neural networks are heavily over-parameterized in practice, such randomness can cause churn: disagreements, on the same input, between the predictions of two models independently trained by the same algorithm. This contributes to the "reproducibility challenges" in modern machine learning. In this paper, we study this problem of churn, identify factors that cause it, and propose two simple means of mitigating it. We first demonstrate that churn is indeed an issue, even for standard image classification tasks (CIFAR and ImageNet), and study the role of the different sources of training randomness that cause it. By analyzing the relationship between churn and prediction confidences, we pursue an approach with two components for churn reduction. First, we propose using minimum-entropy regularizers to increase prediction confidences. Second, we present a novel variant of the co-distillation approach <cit.> to increase model agreement and reduce churn. We present empirical results showing that both techniques reduce churn while improving the accuracy of the underlying model.
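For concreteness, the sketch below shows one way the quantities discussed in the abstract could be computed: a churn metric (the fraction of inputs on which two independently trained models' top-1 predictions disagree) and a training objective that combines cross-entropy with an entropy penalty and a symmetric KL agreement term in the spirit of co-distillation. This is a minimal illustration, not the paper's implementation; the function names, the KL-based agreement term, and the weights alpha and beta are assumptions.

# Minimal sketch (an assumption, not the authors' released code).
import torch
import torch.nn.functional as F

def churn(logits_a, logits_b):
    """Fraction of examples on which two models' top-1 predictions disagree."""
    preds_a = logits_a.argmax(dim=-1)
    preds_b = logits_b.argmax(dim=-1)
    return (preds_a != preds_b).float().mean().item()

def entropy(logits):
    """Mean Shannon entropy of the softmax distribution over classes."""
    log_p = F.log_softmax(logits, dim=-1)
    return -(log_p.exp() * log_p).sum(dim=-1).mean()

def codistillation_loss(logits_a, logits_b, labels, alpha=0.1, beta=1.0):
    """Cross-entropy for both models, plus an entropy penalty (to encourage
    confident predictions) and a symmetric KL term (to encourage the two
    models to agree). alpha and beta are illustrative weights, not values
    taken from the paper."""
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels)
    ent = entropy(logits_a) + entropy(logits_b)
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    agree = F.kl_div(log_p_a, log_p_b, reduction="batchmean", log_target=True) \
          + F.kl_div(log_p_b, log_p_a, reduction="batchmean", log_target=True)
    return ce + alpha * ent + beta * agree

In such a setup, both networks would be updated with the joint objective during training, and churn() would be evaluated on a held-out set to measure the disagreement rate between the two trained models.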


Related research

07/19/2019 · Post-synaptic potential regularization has potential
Improving generalization is one of the main challenges for training deep...

08/19/2022 · Predicting Exotic Hadron Masses with Data Augmentation Using Multilayer Perceptron
Recently, there have been significant developments in neural networks; t...

02/09/2021 · Locally Adaptive Label Smoothing for Predictive Churn
Training modern neural networks is an inherently noisy process that can ...

10/23/2019 · Occlusions for Effective Data Augmentation in Image Classification
Deep networks for visual recognition are known to leverage "easy to reco...

04/04/2023 · Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable
Typical neural network trainings have substantial variance in test-set p...

10/19/2020 · Anti-Distillation: Improving reproducibility of deep networks
Deep networks have been revolutionary in improving performance of machin...

04/10/2022 · Reducing Model Jitter: Stable Re-training of Semantic Parsers in Production Environments
Retraining modern deep learning systems can lead to variations in model ...
