Lattice Recurrent Unit: Improving Convergence and Statistical Efficiency for Sequence Modeling

10/06/2017
by Chaitanya Ahuja, et al.

Recurrent neural networks have shown remarkable success in modeling sequences. However, low-resource situations still adversely affect the generalizability of these models. We introduce a new family of models, called Lattice Recurrent Units (LRU), to address the challenge of learning deep multi-layer recurrent models with limited resources. LRU models achieve this goal by creating distinct (but coupled) flows of information inside the units: a first flow along the time dimension and a second flow along the depth dimension. The model also offers symmetry in how information can flow horizontally and vertically. We analyze the effects of decoupling three components of our LRU model: the Reset Gate, the Update Gate, and the Projected State. We evaluate this family of new LRU models on computational convergence rates and statistical efficiency. Our experiments are performed on four publicly available datasets, comparing against Grid-LSTM and Recurrent Highway Networks. Our results show that LRU achieves better empirical computational convergence rates and statistical efficiency, along with learning more accurate language models.
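
The abstract describes the architecture without giving its equations, so the following is a minimal, hypothetical PyTorch sketch of one GRU-style lattice cell with two coupled information flows, one along time and one along depth, each with its own (decoupled) Reset Gate, Update Gate, and Projected State. All names here (LatticeCell, h_time, h_depth) are illustrative assumptions, not the authors' reference implementation.

```python
# Illustrative sketch only: a GRU-style cell with two coupled flows
# (time and depth), following the description in the abstract above.
import torch
import torch.nn as nn

class LatticeCell(nn.Module):
    def __init__(self, size: int):
        super().__init__()
        directions = ("time", "depth")
        # Decoupled parameters: each flow gets its own reset gate,
        # update gate, and projected-state transform.
        self.reset = nn.ModuleDict({d: nn.Linear(2 * size, size) for d in directions})
        self.update = nn.ModuleDict({d: nn.Linear(2 * size, size) for d in directions})
        self.proj = nn.ModuleDict({d: nn.Linear(2 * size, size) for d in directions})

    def forward(self, h_time: torch.Tensor, h_depth: torch.Tensor):
        # Coupling: both incoming states feed every gate of both flows.
        joint = torch.cat([h_time, h_depth], dim=-1)
        out = {}
        for d, h_in in (("time", h_time), ("depth", h_depth)):
            r = torch.sigmoid(self.reset[d](joint))    # reset gate
            z = torch.sigmoid(self.update[d](joint))   # update gate
            # Projected state: candidate computed from reset-gated inputs.
            gated = torch.cat([r * h_time, r * h_depth], dim=-1)
            cand = torch.tanh(self.proj[d](gated))
            out[d] = (1.0 - z) * h_in + z * cand       # GRU-style interpolation
        # One output continues along time, the other along depth,
        # giving the symmetric horizontal/vertical flow described above.
        return out["time"], out["depth"]

# Example: one step of tiling the cell over a (time x depth) lattice.
cell = LatticeCell(size=8)
h_t = torch.zeros(1, 8)  # state arriving from the previous time step
h_d = torch.zeros(1, 8)  # state arriving from the layer below
h_t, h_d = cell(h_t, h_d)
```

Sharing one set of cell weights across the lattice is just one possible tying scheme; the paper's actual parameterization, and how sequence inputs enter the lattice, are not specified in this abstract.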
