Maintaining Plasticity in Deep Continual Learning

by Shibhansh Dohare et al.
University of Alberta

Modern deep-learning systems are specialized to problem settings in which training occurs once and then never again, as opposed to continual-learning settings in which training occurs continually. If deep-learning systems are applied in a continual-learning setting, it is well known that they may catastrophically forget earlier examples. More fundamental, but less well known, is that they may also lose their ability to adapt to new data, a phenomenon called loss of plasticity. We show loss of plasticity using the MNIST and ImageNet datasets repurposed for continual learning as sequences of tasks. In ImageNet, binary classification performance dropped from 89% correct on an early task down to 77%, about the level of a linear network, on the 2000th task. Such loss of plasticity occurred with a wide range of deep network architectures, optimizers, and activation functions, and was not eased by batch normalization or dropout. In our experiments, loss of plasticity was correlated with the proliferation of dead units, with very large weights, and more generally with a loss of unit diversity. Loss of plasticity was substantially eased by L^2-regularization, particularly when combined with weight perturbation (Shrink and Perturb). We show that plasticity can be fully maintained by a new algorithm – called continual backpropagation – which is just like conventional backpropagation except that a small fraction of less-used units are reinitialized after each example.
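The abstract's final sentence describes the core mechanism of continual backpropagation: ordinary backpropagation plus selective reinitialization of a small fraction of less-used hidden units after each example. Below is a minimal NumPy sketch of that idea for a one-hidden-layer network. The utility measure used here (a running average of each unit's absolute activation times its outgoing weight magnitude), the replacement rate, and the reinitialization details are simplifying assumptions for illustration, not the paper's exact algorithm, which among other things uses a maturity threshold before a unit becomes eligible for replacement.

```python
import numpy as np

rng = np.random.default_rng(0)

class ContinualBackpropMLP:
    """Toy 1-hidden-layer ReLU MLP trained with SGD, plus selective
    reinitialization of low-utility hidden units, in the spirit of
    continual backpropagation. The utility measure is a simplification."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.01,
                 replacement_rate=1e-3, decay=0.99):
        self.W1 = rng.normal(0, 1 / np.sqrt(n_in), (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 1 / np.sqrt(n_hidden), (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.util = np.zeros(n_hidden)   # running utility per hidden unit
        self.lr, self.rr, self.decay = lr, replacement_rate, decay
        self.to_replace = 0.0            # fractional units accumulated

    def step(self, x, y):
        # Forward pass.
        h_pre = x @ self.W1 + self.b1
        h = np.maximum(h_pre, 0.0)               # ReLU activations
        out = h @ self.W2 + self.b2
        err = out - y                             # squared-error gradient

        # Plain backpropagation / SGD update.
        dW2 = np.outer(h, err)
        dh = (self.W2 @ err) * (h_pre > 0)
        dW1 = np.outer(x, dh)
        self.W2 -= self.lr * dW2; self.b2 -= self.lr * err
        self.W1 -= self.lr * dW1; self.b1 -= self.lr * dh

        # Utility: running average of each unit's contribution to the output
        # (|activation| times outgoing weight magnitude) -- an assumption here.
        contrib = np.abs(h) * np.abs(self.W2).sum(axis=1)
        self.util = self.decay * self.util + (1 - self.decay) * contrib

        # Reinitialize a small fraction of the least-used units per example.
        self.to_replace += self.rr * len(self.util)
        n = int(self.to_replace)
        if n:
            self.to_replace -= n
            idx = np.argsort(self.util)[:n]
            self.W1[:, idx] = rng.normal(0, 1 / np.sqrt(self.W1.shape[0]),
                                         (self.W1.shape[0], n))
            self.b1[idx] = 0.0
            self.W2[idx, :] = 0.0        # new unit starts with no influence
            self.util[idx] = np.median(self.util)   # fresh utility estimate
        return float((err ** 2).sum())
```

Zeroing the new unit's outgoing weights means a reinitialized unit cannot disturb the network's current predictions; it only gains influence through subsequent gradient updates, which is the usual rationale for this style of replacement.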


Meta-Consolidation for Continual Learning

The ability to continuously learn and adapt itself to new tasks, without...

Drinking from a Firehose: Continual Learning with Web-scale Natural Language

Continual learning systems will interact with humans, with each other, a...

Continual Learning with Extended Kronecker-factored Approximate Curvature

We propose a quadratic penalty method for continual learning of neural n...

One Pass ImageNet

We present the One Pass ImageNet (OPIN) problem, which aims to study the...

A Study on Efficiency in Continual Learning Inspired by Human Learning

Humans are efficient continual learning systems; we continually learn ne...

Continual General Chunking Problem and SyncMap

Humans possess an inherent ability to chunk sequences into their constit...

Batch Model Consolidation: A Multi-Task Model Consolidation Framework

In Continual Learning (CL), a model is required to learn a stream of tas...
