Knowledge Distillation for Efficient Sequences of Training Runs

03/11/2023
by   Xingyu Liu, et al.

In many practical scenarios, such as hyperparameter search or continual retraining with new data, related training runs are performed many times in sequence. Current practice is to train each of these models independently from scratch. We study the problem of exploiting the computation invested in previous runs to reduce the cost of future runs using knowledge distillation (KD). We find that augmenting future runs with KD from previous runs dramatically reduces the time necessary to train these models, even after accounting for the overhead of KD. We improve on these results with two strategies that reduce the overhead of KD by 80-90%, yielding vast Pareto improvements in overall cost. We conclude that KD is a promising avenue for reducing the cost of the expensive preparatory work that precedes training final models in practice.
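The abstract does not spell out the distillation setup, but the core idea of reusing a previous run's model as a teacher can be illustrated with a standard Hinton-style KD objective. The sketch below is an assumption-laden illustration, not the paper's exact method: the function names (`kd_loss`, `train_step`), the temperature `T`, and the mixing weight `alpha` are all hypothetical choices, and the paper's two overhead-reduction strategies are not implemented here.

```python
# Minimal sketch of distilling from a previous run's checkpoint
# ("teacher") into a new run's model ("student"). Hypothetical setup;
# the paper's exact loss and overhead-reduction strategies may differ.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend hard-label cross-entropy with a temperature-softened
    KL term against the teacher's predictions (standard KD loss)."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so soft-target gradients match the CE term
    return alpha * ce + (1.0 - alpha) * kl

def train_step(student, teacher, batch, optimizer):
    """One step of a later run in the sequence, using the previous
    run's frozen model as the teacher."""
    inputs, labels = batch
    with torch.no_grad():           # the teacher forward pass is the
        t_logits = teacher(inputs)  # KD overhead the paper reduces
    s_logits = student(inputs)
    loss = kd_loss(s_logits, t_logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note the design point this makes concrete: the extra cost of KD is essentially one teacher forward pass per batch, which is why strategies that cut that overhead by 80-90% can make distillation a near-free addition to each subsequent run.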


