Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
At present, the state-of-the-art computational models across a range of sequential data processing tasks, including language modeling, are based on recurrent neural network architectures. This paper begins with the observation that most research on developing computational models capable of processing sequential data fails to explicitly analyze the long distance dependencies (LDDs) within the datasets the models process. In this context, in this paper, we make five research contributions. First, we argue that a key step in modeling sequential data is to understand the characteristics of the LDDs within the data. Second, we present a method to compute and analyze the LDD characteristics of any sequential dataset, and demonstrate this method on a number of sequential datasets that are frequently used for model benchmarking. Third, based on the analysis of the LDD characteristics within the benchmarking datasets, we observe that LDDs are far more complex than previously assumed, and depend on at least four factors: (i) the number of unique symbols in a dataset, (ii) size of the dataset, (iii) the number of interacting symbols within an LDD, and (iv) the distance between the interacting symbols. Fourth, we verify these factors by using synthetic datasets generated using Strictly k-Piecewise (SPk) languages. We then demonstrate how SPk languages can be used to generate benchmarking datasets with varying degrees of LDDs. The advantage of these synthesized datasets being that they enable the targeted testing of recurrent neural architectures. Finally, we demonstrate how understanding the characteristics of the LDDs in a dataset can inform better hyper-parameter selection for current state-of-the-art recurrent neural architectures and also aid in understanding them...
READ FULL TEXT