Continual learning with direction-constrained optimization

11/25/2020
by Yunfei Teng, et al.

This paper studies a new optimization algorithm for training deep learning models with a fixed-architecture classification network in a continual learning framework, where the training data are non-stationary and the non-stationarity is imposed by a sequence of distinct tasks. This setting implies the existence of a manifold of network parameters that yield good performance on all tasks. Our algorithm is derived from the geometric properties of this manifold. We first analyze a deep model trained on a single task in isolation and identify a region in network parameter space where the model's performance stays close to the recovered optimum. We provide empirical evidence that this region resembles a cone expanding along the convergence direction. We then study the principal directions of the optimizer's trajectory after convergence and show that traveling along a few top principal directions quickly moves the parameters outside the cone, whereas traveling along the remaining directions does not. We argue that catastrophic forgetting in a continual learning setting can be alleviated by constraining the parameters to stay within the intersection of the plausible cones of the individual tasks encountered so far during training. Enforcing this constraint is equivalent to preventing the parameters from moving along the top principal directions of convergence corresponding to past tasks. For each task we introduce a new linear autoencoder that approximates its top forbidden principal directions; these are then incorporated into the loss function as a regularization term so that subsequent tasks can be learned without forgetting. We empirically demonstrate that our algorithm performs favorably compared to other state-of-the-art regularization-based continual learning methods, including EWC and SI.

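The per-task regularizer described in the abstract can be illustrated with a short sketch. The code below is an assumed illustration rather than the authors' implementation: it recovers the top principal directions of the post-convergence parameter trajectory with a plain SVD (the paper approximates the same subspace with a linear autoencoder) and penalizes parameter displacement along those directions. The names `trajectory_principal_directions`, `direction_penalty`, `theta_star`, and `lambda_reg` are illustrative assumptions.

```python
import torch


def trajectory_principal_directions(param_snapshots, k):
    """Top-k principal directions of a post-convergence parameter trajectory.

    param_snapshots: (T, D) tensor of flattened parameter vectors recorded
    after convergence on a task. The paper fits a linear autoencoder to
    approximate this subspace; a plain SVD is used here for brevity.
    """
    centered = param_snapshots - param_snapshots.mean(dim=0, keepdim=True)
    # Rows of Vh are the principal directions in parameter space.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return vh[:k]  # (k, D)


def direction_penalty(model, theta_star, top_directions, lambda_reg=1.0):
    """Regularization term discouraging movement along forbidden directions.

    theta_star: flattened parameters at the end of the previous task.
    top_directions: (k, D) tensor returned by trajectory_principal_directions.
    """
    theta = torch.cat([p.reshape(-1) for p in model.parameters()])
    displacement = theta - theta_star
    # Squared projection of the displacement onto each forbidden direction.
    projections = top_directions @ displacement
    return lambda_reg * (projections ** 2).sum()
```

During training on a subsequent task, this penalty would be added to the task loss, with one (theta_star, top_directions) pair retained per previously encountered task.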

Related research

Continual Learning by Asymmetric Loss Approximation with Single-Side Overestimation (08/08/2019)
Catastrophic forgetting is a critical challenge in training deep neural ...

Continual Learning with Guarantees via Weight Interval Constraints (06/16/2022)
We introduce a new training paradigm that enforces interval constraints ...

Task Agnostic Continual Learning via Meta Learning (06/12/2019)
While neural networks are powerful function approximators, they suffer f...

SOLA: Continual Learning with Second-Order Loss Approximation (06/19/2020)
Neural networks have achieved remarkable success in many cognitive tasks...

Uncertainty-based Continual Learning with Adaptive Regularization (05/28/2019)
We introduce a new regularization-based continual learning algorithm, du...

Continual evaluation for lifelong learning: Identifying the stability gap (05/26/2022)
Introducing a time dependency on the data generating distribution has pr...

À-la-carte Prompt Tuning (APT): Combining Distinct Data Via Composable Prompting (02/15/2023)
We introduce À-la-carte Prompt Tuning (APT), a transformer-based scheme ...
