Collaborative Teacher-Student Learning via Multiple Knowledge Transfer

01/21/2021
by Liyuan Sun, et al.

Knowledge distillation (KD), an efficient and effective model compression technique, has received considerable attention in deep learning. The key to its success is transferring knowledge from a large teacher network to a small student network. However, most existing knowledge distillation methods consider only one type of knowledge, learned from either instance features or instance relations, via a specific distillation strategy in teacher-student learning. Few works explore transferring different types of knowledge with different distillation strategies in a unified framework. Moreover, the frequently used offline distillation suffers from limited learning capacity due to the fixed teacher-student architecture. In this paper, we propose Collaborative Teacher-Student Learning via Multiple Knowledge Transfer (CTSL-MKT), which promotes both self-learning and collaborative learning. It allows multiple students to learn knowledge from both individual instances and instance relations in a collaborative way. While learning from themselves via self-distillation, the students can also guide each other via online distillation. Experiments and ablation studies on four image datasets demonstrate that the proposed CTSL-MKT significantly outperforms state-of-the-art KD methods.
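The following is a minimal PyTorch sketch of the idea described in the abstract, not the authors' released implementation. It assumes each student network returns a (features, logits) pair; the temperature T and the weights alpha and beta are illustrative choices, and the self-distillation branch (each student distilling its own deeper layers into shallower ones) is omitted for brevity.

import torch.nn.functional as F


def kd_kl(student_logits, teacher_logits, T=4.0):
    # Per-instance knowledge: KL divergence between softened class distributions.
    p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits.detach() / T, dim=1)
    return F.kl_div(p_s, p_t, reduction="batchmean") * (T * T)


def relation_loss(feat_a, feat_b):
    # Instance-relation knowledge: match the batch-wise cosine-similarity matrices.
    g_a = F.normalize(feat_a, dim=1) @ F.normalize(feat_a, dim=1).t()
    g_b = F.normalize(feat_b, dim=1) @ F.normalize(feat_b, dim=1).t()
    return F.mse_loss(g_a, g_b.detach())


def collaborative_step(students, optimizers, x, y, alpha=1.0, beta=1.0):
    # One online-distillation update: every student learns from the labels,
    # from each peer's per-instance predictions, and from each peer's
    # instance relations.
    outputs = [net(x) for net in students]  # each net returns (features, logits)
    losses = []
    for i, (feat_i, logit_i) in enumerate(outputs):
        loss = F.cross_entropy(logit_i, y)
        for j, (feat_j, logit_j) in enumerate(outputs):
            if i == j:
                continue
            loss = loss + alpha * kd_kl(logit_i, logit_j)       # peer logits
            loss = loss + beta * relation_loss(feat_i, feat_j)  # peer relations
        losses.append(loss)
    for opt in optimizers:
        opt.zero_grad()
    for loss in losses:
        loss.backward()
    for opt in optimizers:
        opt.step()
    return [float(l) for l in losses]

Because the peer targets are detached, each student's backward pass touches only its own computation graph, so all students can be updated in a single collaborative step without retaining graphs.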

