Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

10/04/2021
by Prajjwal Bhargava, et al.

Much of the recent progress in NLU has been shown to stem from models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the training data and increasing the model size. We report two successful and three unsuccessful strategies, all of which provide insight into how Transformer-based models learn to generalize.
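
To make the setup concrete, below is a minimal sketch of the MNLI-to-HANS evaluation the case study revolves around. This is not the paper's own code: the checkpoint name is a placeholder (any BERT fine-tuned on MNLI works), and the three-way MNLI predictions are collapsed into HANS's binary entailment/non-entailment scheme, as is standard for this benchmark.

```python
# Sketch of an MNLI -> HANS generalization probe, assuming the Hugging Face
# `transformers` and `datasets` libraries are installed.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL = "textattack/bert-base-uncased-MNLI"  # assumed checkpoint; swap in your own

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL).eval()

hans = load_dataset("hans", split="validation")  # premise/hypothesis pairs

correct = 0
for ex in hans:
    enc = tokenizer(ex["premise"], ex["hypothesis"],
                    truncation=True, return_tensors="pt")
    with torch.no_grad():
        pred = model(**enc).logits.argmax(dim=-1).item()
    # HANS is binary, so collapse the 3-way MNLI output: entailment stays
    # entailment; neutral and contradiction both count as non-entailment.
    # MNLI label order varies by checkpoint, hence the id2label lookup
    # (inspect model.config.id2label if your checkpoint uses generic names).
    predicted_entailment = model.config.id2label[pred].lower().startswith("entail")
    correct += int(predicted_entailment == (ex["label"] == 0))  # HANS: 0 = entailment

print(f"HANS accuracy: {correct / len(hans):.3f}")
```

What makes HANS a useful probe is that a model relying on surface heuristics (e.g., lexical overlap) can score high on MNLI yet fall to near chance on HANS's non-entailment cases, so the gap between the two accuracies is a direct measure of heuristic reliance.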

Related research

02/27/2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Transformer-based models pre-trained on large-scale corpora achieve stat...

08/07/2023
Deepfake Detection: A Comparative Analysis
This paper presents a comprehensive comparative analysis of supervised an...

03/15/2023
Transformer Models for Type Inference in the Simply Typed Lambda Calculus: A Case Study in Deep Learning for Code
Despite a growing body of work at the intersection of deep learning and ...

10/23/2022
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Humans can reason compositionally whilst grounding language utterances t...

08/26/2021
The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Recently, many datasets have been proposed to test the systematic genera...

08/31/2019
Quantity doesn't buy quality syntax with neural language models
Recurrent neural networks can learn to predict upcoming words remarkably...