Benefits of Depth for Long-Term Memory of Recurrent Networks

10/25/2017
by Yoav Levine et al.

The key attribute that drives the unprecedented success of modern Recurrent Neural Networks (RNNs) on learning tasks that involve sequential data is their ever-improving ability to model intricate long-term temporal dependencies. However, an adequate measure of RNNs' long-term memory capacity is lacking, and thus formal understanding of their ability to correlate data throughout time is limited. Although depth efficiency in convolutional networks is well established, it does not suffice to account for the success of deep RNNs on data of varying lengths, and the need to address their 'time-series expressive power' arises. In this paper, we analyze the effect of depth on the ability of recurrent networks to express correlations ranging over long time-scales. To meet the above need, we introduce a measure of the information flow across time supported by the network, referred to as the Start-End separation rank. This measure essentially reflects the distance of the function realized by the recurrent network from a function that models no interaction whatsoever between the beginning and the end of the input sequence. We prove that deep recurrent networks support Start-End separation ranks which are exponentially higher than those supported by their shallow counterparts, thus establishing that depth brings forth an overwhelming advantage in the ability of recurrent networks to model long-term dependencies. Such analyses may be readily extended to other RNN architectures of interest, e.g. variants of LSTM networks. We obtain our results by considering a class of recurrent networks referred to as Recurrent Arithmetic Circuits (RACs), which merge the hidden state with the input via the Multiplicative Integration operation. Finally, we make use of the tool of quantum Tensor Networks to gain additional graphic insight into the complexity brought forth by depth in recurrent networks.
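The two constructions named in the abstract can be sketched concretely. Roughly, for an input sequence whose first half ("Start") is x_S = (x_1, ..., x_{T/2}) and whose second half ("End") is x_E = (x_{T/2+1}, ..., x_T), the Start-End separation rank of a function f realized by the network is the smallest number of summands needed to write f as a sum of products of a Start-only function with an End-only function:

    sep_{S,E}(f) = min { R : f(x_1, ..., x_T) = sum_{r=1}^{R} g_r(x_S) * h_r(x_E) }.

A separation rank of 1 corresponds to a function that models no interaction between the beginning and end of the sequence; higher ranks indicate richer long-term dependencies.

The snippet below is a minimal sketch of a Recurrent Arithmetic Circuit cell, whose Multiplicative Integration update merges the projected input and hidden state by an element-wise product rather than by addition, and of stacking such cells into a deep recurrent network. It is not the paper's code: the function names (rac_cell, deep_rac), the layer dimensions, and the exact stacking scheme are assumptions made for illustration.

    import numpy as np

    def rac_cell(x_t, h_prev, W_x, W_h):
        # One RAC step: Multiplicative Integration combines the projected
        # input and the projected previous hidden state via an element-wise
        # (Hadamard) product instead of the usual sum.
        return (W_x @ x_t) * (W_h @ h_prev)

    def deep_rac(xs, params, h0s):
        # Stacked (deep) RAC: at each time step, the activation of each
        # layer is fed as input to the layer above it.
        hs = list(h0s)                      # one hidden state per layer
        for x_t in xs:                      # iterate over the time axis
            inp = x_t
            for l, (W_x, W_h) in enumerate(params):
                hs[l] = rac_cell(inp, hs[l], W_x, W_h)
                inp = hs[l]                 # pass activation up the stack
        return hs[-1]                       # top layer's final hidden state

    # Tiny usage example with random weights (dimensions are arbitrary).
    rng = np.random.default_rng(0)
    d, T, L = 4, 6, 2                       # width, sequence length, depth
    params = [(rng.standard_normal((d, d)), rng.standard_normal((d, d)))
              for _ in range(L)]
    h0s = [np.ones(d) for _ in range(L)]
    xs = [rng.standard_normal(d) for _ in range(T)]
    print(deep_rac(xs, params, h0s).shape)  # -> (4,)

The only deviation from a vanilla stacked RNN in this sketch is the element-wise product inside rac_cell, which is the Multiplicative Integration operation the abstract refers to; the paper's claim is that adding layers to such networks increases the achievable Start-End separation rank exponentially.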


