A Fast Knowledge Distillation Framework for Visual Recognition

12/02/2021
by Zhiqiang Shen, et al.

While Knowledge Distillation (KD) has been recognized as a useful tool in many visual tasks, such as supervised classification and self-supervised representation learning, the main drawback of a vanilla KD framework is that most of its computational overhead is spent on forward passes through the giant teacher networks, making the entire learning procedure inefficient and costly. ReLabel, a recently proposed solution, suggests creating a label map for the entire image. During training, it obtains the region-level label for each crop by RoI align on the pre-generated global label map, allowing for efficient supervision generation without passing through the teachers many times. However, as the KD teachers come from conventional multi-crop training, there are various mismatches between the global label map and the region-level labels in this technique, resulting in performance deterioration. In this study, we present a Fast Knowledge Distillation (FKD) framework that replicates the distillation training phase and generates soft labels using the multi-crop KD approach, while training faster than ReLabel since no post-processing such as RoI align and softmax operations is used. When multiple crops are drawn from the same image during data loading, our FKD is even more efficient than the traditional image classification framework. On ImageNet-1K, we obtain 79.8% top-1 accuracy, outperforming ReLabel by about 1.0% while being faster. On the self-supervised learning task, we also show that FKD has an efficiency advantage. Our project page: http://zhiqiangshen.com/projects/FKD/index.html; source code and models are available at: https://github.com/szq0214/FKD.

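The abstract describes reusing pre-generated, per-crop soft labels at training time so that no teacher forward pass or RoI align is needed. Below is a minimal, hypothetical sketch (not the authors' released code) of what such a multi-crop data loader could look like: it assumes soft labels were generated offline by running the teacher on a fixed set of random crops per image and stored per image as a list of (crop box, soft label) pairs in a `<image_id>.pt` file. The file layout, `num_crops`, and class names are illustrative assumptions.

```python
# Sketch of FKD-style multi-crop data loading with stored soft labels.
# Assumption: for each image, an offline pass saved a list of
# (crop_box, soft_label_tensor) pairs to "<label_dir>/<image_id>.pt".
import os
import random
import torch
from torch.utils.data import Dataset
from torchvision import transforms
from torchvision.datasets.folder import default_loader


class FKDMultiCropDataset(Dataset):
    def __init__(self, image_paths, label_dir, num_crops=4, crop_size=224):
        self.image_paths = image_paths      # list of image file paths
        self.label_dir = label_dir          # dir holding pre-generated soft labels
        self.num_crops = num_crops          # crops sampled from the SAME image
        self.to_tensor = transforms.Compose([
            transforms.Resize((crop_size, crop_size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        path = self.image_paths[idx]
        image = default_loader(path)
        image_id = os.path.splitext(os.path.basename(path))[0]

        # Each stored entry holds the crop box used at label-generation time
        # and the teacher's soft prediction for exactly that crop, so no
        # teacher forward pass or RoI align is needed during student training.
        entries = torch.load(os.path.join(self.label_dir, image_id + ".pt"))
        picked = random.sample(entries, self.num_crops)  # assumes enough entries

        crops, soft_labels = [], []
        for (left, top, width, height), soft_label in picked:
            crop = image.crop((left, top, left + width, top + height))
            crops.append(self.to_tensor(crop))
            soft_labels.append(soft_label)

        # Returning several crops of one image per sample amortizes image
        # decoding, which is why multi-crop loading can be cheaper than
        # loading one crop per image.
        return torch.stack(crops), torch.stack(soft_labels)
```

A training step would then flatten the (batch, num_crops) dimensions and minimize cross-entropy or KL divergence between the student's outputs and these stored soft labels, so the teacher's supervision is reused without any further teacher inference.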

research
09/17/2020

MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

In this paper, we introduce a simple yet effective approach that can boo...
research
07/26/2022

Efficient One Pass Self-distillation with Zipf's Label Smoothing

Self-distillation exploits non-uniform soft supervision from itself duri...
research
04/01/2021

Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study

This work aims to empirically clarify a recently discovered perspective ...
research
05/23/2022

Boosting Multi-Label Image Classification with Complementary Parallel Self-Distillation

Multi-Label Image Classification (MLIC) approaches usually exploit label...
research
08/11/2022

MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition

Unlike the conventional Knowledge Distillation (KD), Self-KD allows a ne...
research
03/04/2022

Better Supervisory Signals by Observing Learning Paths

Better-supervised models might have better performance. In this paper, w...
research
05/10/2021

KDExplainer: A Task-oriented Attention Model for Explaining Knowledge Distillation

Knowledge distillation (KD) has recently emerged as an efficacious schem...
