Continual Learning of Neural Machine Translation within Low Forgetting Risk Regions

11/03/2022
by   Shuhao Gu, et al.

This paper considers continual learning of a large-scale pretrained neural machine translation model without accessing the previous training data or introducing model separation. We argue that the widely used regularization-based methods, which perform multi-objective learning with an auxiliary loss, suffer from a misestimation problem and cannot always achieve a good balance between the previous and new tasks. To solve this problem, we propose a two-stage training method based on the local features of the real loss. We first search for low forgetting risk (LFR) regions, where the model can retain its performance on the previous task as the parameters are updated, to avoid catastrophic forgetting. Then we continually train the model within this region using only the new training data to fit the new task. Specifically, we propose two methods to search for the LFR regions, based on the curvature of the loss and on the impact of the parameters on the model output, respectively. We conduct experiments on domain adaptation and on the more challenging task of language adaptation, and the results show that our method achieves significant improvements over several strong baselines.
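To make the two-stage recipe concrete, below is a minimal PyTorch sketch of the curvature-based variant, under stated assumptions rather than as the paper's implementation: per-parameter forgetting risk is approximated with the diagonal Fisher information (squared gradients), the LFR region is taken as a per-parameter interval around the pretrained weights whose radius shrinks as curvature grows, and the region is enforced by clamping after each update. Helper names (estimate_risk, region_radii, train_in_region), the learning rate, and the choice of proxy data for Stage 1 are all illustrative assumptions; in particular, since the previous training data is unavailable, the risk estimate would have to be run on whatever proxy data is at hand.

```python
# Hypothetical sketch of two-stage continual training within a
# low-forgetting-risk region; not the authors' code.
import torch


def estimate_risk(model, loader, loss_fn, n_batches=100):
    """Stage 1a: approximate per-parameter loss curvature with squared
    gradients (diagonal Fisher), averaged over a few batches."""
    risk = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    batches = 0
    for x, y in loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                risk[n] += p.grad.detach() ** 2
        batches += 1
        if batches >= n_batches:
            break
    return {n: r / max(batches, 1) for n, r in risk.items()}


def region_radii(risk, base_radius=1e-2, eps=1e-12):
    """Stage 1b: define the LFR region as an interval per parameter.
    High-curvature (risky) parameters get a small radius; flat (safe)
    directions get a large one. The inverse-sqrt scaling is an assumption."""
    return {n: base_radius / (r + eps).sqrt() for n, r in risk.items()}


def train_in_region(model, new_loader, loss_fn, radii, lr=1e-4, epochs=1):
    """Stage 2: plain fine-tuning on the new task, but after every
    optimizer step each parameter is clamped back into its interval
    around the pretrained value, so the model never leaves the region."""
    theta0 = {n: p.detach().clone() for n, p in model.named_parameters()}
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in new_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                for n, p in model.named_parameters():
                    p.clamp_(theta0[n] - radii[n], theta0[n] + radii[n])
```

In use, estimate_risk would be run once on the frozen pretrained model, the radii fixed, and fine-tuning then performed on the new-domain data only. The paper's second variant replaces the curvature score with a measure of each parameter's impact on the model output, but the clamped two-stage structure sketched here stays the same.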


Related research

Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation (03/25/2021)
Domain Adaptation is widely used in practical applications of neural mac...

Investigating Catastrophic Forgetting During Continual Training for Neural Machine Translation (11/02/2020)
Neural machine translation (NMT) models usually suffer from catastrophic...

Overcoming Catastrophic Forgetting beyond Continual Learning: Balanced Training for Neural Machine Translation (03/08/2022)
Neural networks tend to gradually forget the previously learned knowledg...

Continual Learning via Bit-Level Information Preserving (05/10/2021)
Continual learning tackles the setting of learning different tasks seque...

Introducing Language Guidance in Prompt-based Continual Learning (08/30/2023)
Continual Learning aims to learn a single model on a sequence of tasks w...

Fixed Design Analysis of Regularization-Based Continual Learning (03/17/2023)
We consider a continual learning (CL) problem with two linear regression...

Task Difficulty Aware Parameter Allocation Regularization for Lifelong Learning (04/11/2023)
Parameter regularization or allocation methods are effective in overcomi...
