Memory Efficient Continual Learning for Neural Text Classification

03/09/2022
by Beyza Ermis, et al.

Learning text classifiers based on pre-trained language models has become the standard practice in natural language processing applications. Unfortunately, training large neural language models, such as transformers, from scratch is very costly and requires a vast amount of training data, which might not be available in the application domain of interest. Moreover, in many real-world scenarios, classes are uncovered as more data is seen, calling for class-incremental modelling approaches. In this work, we devise a method to perform text classification using pre-trained models on a sequence of classification tasks. We formalize the problem as a continual learning problem, in which the algorithm learns new tasks without performance degradation on previous ones and without re-training the model from scratch. We empirically demonstrate that our method requires significantly fewer model parameters than other state-of-the-art methods and that it is significantly faster at inference time. The tight control on the number of model parameters, and hence on memory, does more than improve efficiency: it makes the algorithm usable in real-world applications, where deploying a solution whose memory consumption grows without bound is simply unrealistic. While our method suffers little forgetting, its predictive performance remains on par with state-of-the-art but less memory-efficient methods.
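
The abstract does not spell out the algorithm, but the setting it describes (a frozen pre-trained backbone, tasks arriving in sequence, a tightly controlled parameter budget) can be illustrated with a minimal sketch. Everything below is an illustrative assumption, not the paper's method: the `ClassIncrementalClassifier` name, the per-task linear heads, and the mean-pooled BERT features are placeholders chosen only to show how memory can grow by one small head per task while old parameters stay fixed.

```python
# Minimal sketch of class-incremental text classification with a frozen
# pre-trained encoder. NOT the paper's exact algorithm (the abstract does
# not specify it); it only illustrates the continual-learning setting.
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer


class ClassIncrementalClassifier(nn.Module):
    """Frozen pre-trained encoder plus one small trainable head per task."""

    def __init__(self, encoder_name="bert-base-uncased"):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(encoder_name)
        self.encoder = AutoModel.from_pretrained(encoder_name)
        for p in self.encoder.parameters():  # never re-train the backbone
            p.requires_grad = False
        self.heads = nn.ModuleList()  # memory grows only by one head per task

    def add_task(self, num_new_classes):
        """Register a head for the classes uncovered by the new task."""
        head = nn.Linear(self.encoder.config.hidden_size, num_new_classes)
        self.heads.append(head)
        return head

    @torch.no_grad()
    def embed(self, texts):
        batch = self.tokenizer(texts, padding=True, truncation=True,
                               return_tensors="pt")
        hidden = self.encoder(**batch).last_hidden_state  # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1).float()
        return (hidden * mask).sum(1) / mask.sum(1)  # mean-pooled sentence vector

    def forward(self, texts):
        z = self.embed(texts)
        # Earlier heads and the encoder are never updated, so predictions on
        # previously seen classes cannot degrade in this simple scheme.
        return torch.cat([head(z) for head in self.heads], dim=-1)


# Usage: when a new task arrives, only the freshly added head is optimized.
model = ClassIncrementalClassifier()
head = model.add_task(num_new_classes=3)  # hypothetical 3-class task
opt = torch.optim.Adam(head.parameters(), lr=1e-3)
logits = head(model.embed(["a training sentence"]))
loss = nn.functional.cross_entropy(logits, torch.tensor([2]))
loss.backward()
opt.step()
```

In a sketch like this, inference cost and memory are easy to bound: the expensive encoder forward pass is shared across all tasks, and each new task adds only a hidden-size-by-class-count linear layer.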

