Teacher-Student Architecture for Knowledge Learning: A Survey

by   Chengming Hu, et al.

Although Deep Neural Networks (DNNs) have shown a strong capacity to solve large-scale problems in many areas, such DNNs with voluminous parameters are hard to be deployed in a real-time system. To tackle this issue, Teacher-Student architectures were first utilized in knowledge distillation, where simple student networks can achieve comparable performance to deep teacher networks. Recently, Teacher-Student architectures have been effectively and widely embraced on various knowledge learning objectives, including knowledge distillation, knowledge expansion, knowledge adaption, and multi-task learning. With the help of Teacher-Student architectures, current studies are able to achieve multiple knowledge-learning objectives through lightweight and effective student networks. Different from the existing knowledge distillation surveys, this survey detailedly discusses Teacher-Student architectures with multiple knowledge learning objectives. In addition, we systematically introduce the knowledge construction and optimization process during the knowledge learning and then analyze various Teacher-Student architectures and effective learning schemes that have been leveraged to learn representative and robust knowledge. This paper also summarizes the latest applications of Teacher-Student architectures based on different purposes (i.e., classification, recognition, and generation). Finally, the potential research directions of knowledge learning are investigated on the Teacher-Student architecture design, the quality of knowledge, and the theoretical studies of regression-based learning, respectively. With this comprehensive survey, both industry practitioners and the academic community can learn insightful guidelines about Teacher-Student architectures on multiple knowledge learning objectives.


Teacher-Student Architecture for Knowledge Distillation: A Survey

Although Deep neural networks (DNNs) have shown a strong capacity to sol...

Adaptive Multi-Teacher Multi-level Knowledge Distillation

Knowledge distillation (KD) is an effective learning paradigm for improv...

Categories of Response-Based, Feature-Based, and Relation-Based Knowledge Distillation

Deep neural networks have achieved remarkable performance for artificial...

Spirit Distillation: Precise Real-time Prediction with Insufficient Data

Recent trend demonstrates the effectiveness of deep neural networks (DNN...

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Deep neural models in recent years have been successful in almost every ...

Robust Student Network Learning

Deep neural networks bring in impressive accuracy in various application...

An Efficient Federated Distillation Learning System for Multi-task Time Series Classification

This paper proposes an efficient federated distillation learning system ...

Please sign up or login with your details

Forgot password? Click here to reset