The relational processing limits of classic and contemporary neural network models of language processing

by Guillermo Puebla, et al.

The ability of neural networks to capture relational knowledge is a matter of long-standing controversy. Recently, some researchers on the PDP side of the debate have argued that (1) classic PDP models can handle relational structure (Rogers & McClelland, 2008, 2014) and (2) the success of deep learning approaches to text processing suggests that structured representations are unnecessary to capture the gist of human language (Rabovsky et al., 2018). In the present study we tested the Story Gestalt model (St. John, 1992), a classic PDP model of text comprehension, and a Sequence-to-Sequence with Attention model (Bahdanau et al., 2015), a contemporary deep learning architecture for text processing. Both models were trained to answer questions about stories based on the thematic roles that several concepts played in the stories. In three critical tests we varied the statistical structure of new stories while keeping their relational structure constant with respect to the training data. Each model was susceptible to each statistical structure manipulation to a different degree, with performance falling below chance under at least one manipulation. We argue that the failures of both models stem from their inability to perform dynamic binding of independent roles and fillers. Ultimately, these results cast doubt on the suitability of traditional neural network models for explaining phenomena based on relational reasoning, including language processing.
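The Sequence-to-Sequence with Attention model evaluated here relies on the additive attention mechanism of Bahdanau et al. (2015), in which a decoder state is compared against every encoder state to produce a soft alignment over the input. The following is a minimal numpy sketch of that scoring step only; all dimensions, weight values, and variable names are illustrative assumptions, not taken from the paper's implementation.

```python
import numpy as np

# Minimal sketch of Bahdanau-style additive attention.
# All sizes and "learned" parameters are random placeholders for illustration.
rng = np.random.default_rng(0)

hidden = 8     # decoder state size (illustrative)
enc_dim = 8    # encoder state size (illustrative)
seq_len = 5    # number of encoder positions (e.g., story tokens)

# Alignment model a(s, h) = v^T tanh(W_s s + W_h h)
W_s = rng.normal(size=(hidden, hidden))
W_h = rng.normal(size=(hidden, enc_dim))
v = rng.normal(size=(hidden,))

s = rng.normal(size=(hidden,))           # current decoder state
H = rng.normal(size=(seq_len, enc_dim))  # encoder states, one per input token

# Alignment scores, then softmax-normalized attention weights
scores = np.array([v @ np.tanh(W_s @ s + W_h @ h) for h in H])
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Context vector: attention-weighted sum of encoder states
context = weights @ H
print(weights.round(3), context.shape)
```

Note that the attention weights form a single soft distribution over input positions; nothing in this computation binds a role variable to a filler independently of the learned weight matrices, which is the limitation the abstract's argument turns on.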



