Causal Transformers Perform Below Chance on Recursive Nested Constructions, Unlike Humans

10/14/2021
by Yair Lakretz, et al.

Recursive processing is considered a hallmark of human linguistic abilities. A recent study evaluated recursive processing in recurrent neural language models (RNN-LMs) and showed that such models perform below chance level on embedded dependencies within nested constructions, a prototypical example of recursion in natural language. Here, we study whether state-of-the-art Transformer LMs do any better. We test four different Transformer LMs on two types of nested constructions, which differ in whether the embedded (inner) dependency is short-range or long-range. We find that Transformers achieve near-perfect performance on short-range embedded dependencies, significantly better than previous results reported for RNN-LMs and humans. However, on long-range embedded dependencies, Transformers' performance drops sharply to below chance level. Remarkably, adding only three words to the embedded dependency caused Transformers to fall from near-perfect to below-chance performance. Taken together, our results reveal a shortcoming of Transformers in recursive, structure-based processing.
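To make the evaluation setup concrete, the following is a minimal sketch of how agreement accuracy on the embedded verb could be scored against chance. It is not the authors' code or stimuli: the lexicon, the sentence templates, the helper names (make_items, agreement_accuracy, recency_scorer) and the assumption that the three added words form a phrase like "near the X" are all hypothetical. The scorer is a deliberately simplistic stand-in so that the example runs without a trained model; it makes no claim about how any Transformer behaves.

```python
from itertools import product
from typing import Callable, Dict, List

# Hypothetical toy lexicon, chosen for illustration only (not the paper's stimuli).
NOUNS = {"sg": ["boy", "farmer", "teacher"], "pl": ["boys", "farmers", "teachers"]}
VERBS = {"sg": "watches", "pl": "watch"}
NUMBERS = ("sg", "pl")


def make_items(long_range: bool) -> List[Dict[str, str]]:
    """Build sentence prefixes that stop right before the embedded (inner) verb."""
    items = []
    for outer, inner, attractor in product(NUMBERS, repeat=3):
        prefix = f"The {NOUNS[outer][0]} that the {NOUNS[inner][1]}"
        if long_range:
            # Assumed template: three extra words separate the embedded subject from its verb.
            prefix += f" near the {NOUNS[attractor][2]}"
        elif attractor == "pl":
            continue  # the short condition has no third noun; skip duplicate prefixes
        wrong = "pl" if inner == "sg" else "sg"
        items.append({"prefix": prefix, "correct": VERBS[inner], "wrong": VERBS[wrong]})
    return items


def agreement_accuracy(items: List[Dict[str, str]],
                       score: Callable[[str, str], float]) -> float:
    """Fraction of items where the grammatical verb outscores the ungrammatical one."""
    hits = sum(score(it["prefix"], it["correct"]) > score(it["prefix"], it["wrong"])
               for it in items)
    return hits / len(items)


def recency_scorer(prefix: str, verb: str) -> float:
    """Toy stand-in for a language model: prefers the verb agreeing with the last noun."""
    last_noun = prefix.split()[-1]
    last_number = "pl" if last_noun in NOUNS["pl"] else "sg"
    return 1.0 if VERBS[last_number] == verb else 0.0


CHANCE = 0.5  # two candidate verb forms per item
short_acc = agreement_accuracy(make_items(long_range=False), recency_scorer)
long_acc = agreement_accuracy(make_items(long_range=True), recency_scorer)
print(f"short-range: {short_acc:.2f}, long-range: {long_acc:.2f}, chance: {CHANCE}")
```

To evaluate an actual language model under this sketch, recency_scorer would be replaced by a function that returns the model's log-probability of the candidate verb given the prefix; the comparison against CHANCE stays the same.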


