How to train RNNs on chaotic data?

10/14/2021
by Zahra Monfared, et al.

Recurrent neural networks (RNNs) are widespread machine learning tools for modeling sequential and time series data. They are notoriously hard to train because their loss gradients backpropagated in time tend to saturate or diverge during training. This is known as the exploding and vanishing gradient problem. Previous solutions to this issue either built on rather complicated, purpose-engineered architectures with gated memory buffers, or, more recently, imposed constraints that ensure convergence to a fixed point or restrict (the eigenspectrum of) the recurrence matrix. Such constraints, however, impose severe limitations on the expressivity of the RNN: essential intrinsic dynamics such as multistability or chaos are disabled. This is inherently at odds with the chaotic nature of many, if not most, time series encountered in nature and society. Here we offer a comprehensive theoretical treatment of this problem by relating the loss gradients during RNN training to the Lyapunov spectrum of RNN-generated orbits. We mathematically prove that RNNs producing stable equilibrium or cyclic behavior have bounded gradients, whereas the gradients of RNNs with chaotic dynamics always diverge. Based on these analyses and insights, we offer an effective yet simple training technique for chaotic data and guidance on how to choose relevant hyperparameters according to the Lyapunov spectrum.
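The abstract's central claim, that gradients backpropagated through a chaotic RNN diverge at a rate governed by the maximal Lyapunov exponent of its orbits, can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' code or training technique: the vanilla tanh RNN without inputs, the network size, and the weight gain are illustrative assumptions. It estimates the maximal Lyapunov exponent of a randomly initialized RNN and compares it with the growth of the Jacobian products that arise in backpropagation through time.

```python
# Minimal sketch (illustrative assumptions, not the paper's method):
# relate the maximal Lyapunov exponent of an RNN orbit to the growth
# of backpropagated gradients.
import numpy as np

rng = np.random.default_rng(0)
N = 64                    # hidden units (assumed size)
g = 3.0                   # weight gain; gains well above 1 typically give chaos
W = g * rng.standard_normal((N, N)) / np.sqrt(N)   # recurrence matrix

def step(h):
    """One step of a vanilla RNN without input: h_{t+1} = tanh(W h_t)."""
    return np.tanh(W @ h)

def jacobian(h):
    """Jacobian of the map at state h: diag(1 - tanh(W h)^2) @ W."""
    return np.diag(1.0 - np.tanh(W @ h) ** 2) @ W

# Maximal Lyapunov exponent: average log growth rate of a tangent vector
# along an orbit, renormalized at every step to avoid overflow.
h = 0.1 * rng.standard_normal(N)
v = rng.standard_normal(N)
v /= np.linalg.norm(v)
T = 2000
log_growth = 0.0
for _ in range(T):
    v = jacobian(h) @ v
    log_growth += np.log(np.linalg.norm(v))
    v /= np.linalg.norm(v)
    h = step(h)
lyap_max = log_growth / T
print(f"estimated maximal Lyapunov exponent: {lyap_max:.3f}")

# The gradient of a late hidden state w.r.t. an early one is a product of
# Jacobians; its norm grows roughly like exp(lyap_max * T) when lyap_max > 0,
# i.e. in the chaotic regime the gradients explode.
for T_bptt in (10, 20, 40):
    h = 0.1 * rng.standard_normal(N)
    states = [h]
    for _ in range(T_bptt):
        h = step(h)
        states.append(h)
    grad = np.eye(N)
    for s in reversed(states[:-1]):
        grad = grad @ jacobian(s)      # chain rule backwards through time
    print(f"T = {T_bptt:3d}: ||dh_T/dh_0|| ~ {np.linalg.norm(grad, 2):.2e}, "
          f"exp(lyap*T) ~ {np.exp(lyap_max * T_bptt):.2e}")
```

With the gain chosen in the chaotic regime, the estimated exponent is positive and the spectral norm of the accumulated Jacobian product grows exponentially with the truncation length, which is exactly the divergence of gradients for chaotic RNNs described in the abstract.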


