Warming-up recurrent neural networks to maximize reachable multi-stability greatly improves learning

06/02/2021
by Nicolas Vecoven, et al.

Training recurrent neural networks is known to be difficult when time dependencies become long. Consequently, training standard gated cells such as gated recurrent units and long short-term memory on benchmarks where long-term memory is required remains an arduous task. In this work, we propose a general way to initialize the connectivity of any recurrent network through a process called "warm-up", which improves its ability to learn arbitrarily long time dependencies. This initialization procedure is designed to maximize the network's reachable multi-stability, i.e., the number of attractors within the network that can be reached through relevant input trajectories. Warming up is performed before training, using stochastic gradient descent on a specifically designed loss. We show that warming up greatly improves recurrent neural network performance on long-term memory benchmarks for multiple recurrent cell types, but that it can sometimes impede precision. We therefore introduce a parallel recurrent network structure with partial warm-up that is shown to greatly improve learning on long time series while maintaining high levels of precision. This approach provides a general framework for improving the learning abilities of any recurrent cell type when long-term memory is required.
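
The abstract describes the warm-up only at a high level: before the main training phase, the recurrent weights are adjusted by stochastic gradient descent on a loss that rewards reachable multi-stability. The snippet below is a minimal PyTorch sketch of that general idea, not the paper's actual loss: the GRU cell, the drive/relaxation lengths, and the pairwise-distance objective (pushing apart the hidden states that different input trajectories relax to, as a crude proxy for counting distinct reachable attractors) are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative warm-up sketch (assumptions, not the paper's exact loss):
# drive the network with input trajectories, let it run autonomously so the
# state relaxes toward an attractor, then encourage the relaxed states of
# different trajectories to stay distinct.

torch.manual_seed(0)

hidden_size, input_size = 64, 8
rnn = nn.GRU(input_size, hidden_size, batch_first=True)
optimizer = torch.optim.SGD(rnn.parameters(), lr=1e-2)

def warmup_step(batch_size=32, drive_len=20, free_len=50):
    # Drive the network with random input trajectories ...
    drive = torch.randn(batch_size, drive_len, input_size)
    _, h = rnn(drive)                      # h: (1, batch, hidden)

    # ... then run it autonomously (zero input) so each hidden state
    # settles toward a self-sustained fixed point, if one is reachable.
    silence = torch.zeros(batch_size, free_len, input_size)
    _, h_final = rnn(silence, h)
    h_final = h_final.squeeze(0)           # (batch, hidden)

    # Proxy objective: maximize pairwise distances between relaxed states,
    # so distinct inputs end up in distinct attractors.
    dists = torch.cdist(h_final, h_final)  # (batch, batch)
    loss = -dists.mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

for step in range(200):
    warmup_step()
# After warm-up, the weights serve as the initialization for ordinary training.
```

After this warm-up phase, the resulting weights are simply used as the initialization for regular task training. The partial warm-up variant mentioned in the abstract would, on this reading, apply such a procedure to only part of the parallel network so that the remaining units preserve precision.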
