Weight Squeezing: Reparameterization for Compression and Fast Inference

10/14/2020
by Artem Chumachenko, et al.

In this work, we present Weight Squeezing, a novel approach to simultaneous knowledge transfer and model compression. With this method, we transfer knowledge from a pre-trained teacher model by learning a mapping from its weights to the weights of a smaller student model, without significant loss of accuracy. We applied Weight Squeezing combined with Knowledge Distillation to a pre-trained text classification model and compared it to various knowledge transfer and model compression methods on several downstream text classification tasks. We observed that our approach produces better results than Knowledge Distillation methods without any loss in inference speed. We also compared Weight Squeezing with Low Rank Factorization methods and found that our method is significantly faster at inference while remaining competitive in accuracy.
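The abstract leaves the exact form of the weight mapping open. Below is a minimal sketch of one plausible reading, assuming PyTorch; the layer name `SqueezedLinear`, the projection parameters `p_out`/`p_in`, and the dimensions `D_T`/`D_S` are all hypothetical illustrations, not the paper's parameterization. The idea shown: a frozen teacher weight matrix is mapped into the smaller student shape by two learnable projections, and the student is trained with a standard Knowledge Distillation objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

D_T, D_S = 768, 256  # hypothetical teacher / student hidden sizes


class SqueezedLinear(nn.Module):
    """Student layer whose weight is a learned mapping of a frozen teacher
    weight: W_s = P_out @ W_t @ P_in. A sketch of the general idea, not the
    paper's exact parameterization."""

    def __init__(self, teacher_weight: torch.Tensor):
        super().__init__()
        # Teacher weight is stored as a buffer: used, but never trained.
        self.register_buffer("w_teacher", teacher_weight)  # (D_T, D_T)
        self.p_out = nn.Parameter(torch.randn(D_S, D_T) / D_T ** 0.5)
        self.p_in = nn.Parameter(torch.randn(D_T, D_S) / D_T ** 0.5)
        self.bias = nn.Parameter(torch.zeros(D_S))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze the (D_T, D_T) teacher weight down to a (D_S, D_S)
        # student weight, then apply it as an ordinary linear layer.
        w_student = self.p_out @ self.w_teacher @ self.p_in
        return F.linear(x, w_student, self.bias)


def kd_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Standard Knowledge Distillation objective: softened teacher targets
    (temperature T) blended with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

A point worth noting under this reading: after training, `w_student` can be materialized once and stored, so the deployed student is an ordinary dense small model with no extra inference-time cost. This would be consistent with the abstract's claim of an inference-speed advantage over Low Rank Factorization, which typically keeps the factored matrix multiplications at inference time.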
