Linear Distillation Learning

06/13/2019
by Arip Asadulaev, et al.

Deep linear networks lack expressive power, but they are mathematically tractable. In our work, we found an architecture in which they become expressive. This paper presents Linear Distillation Learning (LDL), a simple remedy that improves the performance of linear networks through distillation. In deep learning models, distillation often allows a smaller or shallower network to mimic a larger model far more accurately than a network of the same size trained directly on one-hot targets, which cannot achieve results comparable to the cumbersome model. In our method, we train a separate student to distill the teacher for each class in the dataset. The most striking result to emerge from the data is that neural networks without activation functions can achieve high classification accuracy from small amounts of data on the MNIST and Omniglot datasets. Owing to their tractability, linear networks can also be used to explain phenomena observed experimentally in deep non-linear networks. The suggested approach could become a simple and practical instrument, while further studies in the field of linear networks and distillation are yet to be undertaken.
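To make the per-class distillation idea concrete, below is a minimal PyTorch sketch, assuming one purely linear student per class that regresses onto a fixed teacher's representation, with test inputs assigned to the class whose student reproduces the teacher output with the lowest error. The paper's exact teacher, losses, and architectures may differ; every name and hyperparameter here (`make_linear_student`, `depth`, the MSE objective, the argmin decision rule) is illustrative, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

def make_linear_student(in_dim, out_dim, depth=3):
    # Deep linear network: stacked Linear layers with no activations.
    dims = [in_dim] + [out_dim] * depth
    return nn.Sequential(*[nn.Linear(dims[i], dims[i + 1]) for i in range(depth)])

def train_students(teacher, data_by_class, in_dim, out_dim, epochs=100, lr=1e-3):
    # One linear student per class, each distilling the teacher on that
    # class's inputs only (hypothetical training loop).
    students = {}
    for c, x in data_by_class.items():          # x: (n_c, in_dim) inputs of class c
        student = make_linear_student(in_dim, out_dim)
        opt = torch.optim.Adam(student.parameters(), lr=lr)
        with torch.no_grad():
            target = teacher(x)                 # fixed teacher representation
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(student(x), target)
            loss.backward()
            opt.step()
        students[c] = student
    return students

def predict(teacher, students, x):
    # Assign each input to the class whose student best mimics the teacher.
    with torch.no_grad():
        target = teacher(x)
        errors = {c: (s(x) - target).pow(2).mean(dim=1) for c, s in students.items()}
    classes = sorted(errors)
    stacked = torch.stack([errors[c] for c in classes])  # (n_classes, batch)
    return torch.tensor(classes)[stacked.argmin(dim=0)]
```

Under this reading, classification requires no softmax head at all: the per-class distillation error itself acts as the score, which is one way purely linear students can still separate classes.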


