ETC: Encoding Long and Structured Data in Transformers

04/17/2020
by Joshua Ainslie, et al.
Google

Transformer-based models have pushed the state of the art in many natural language processing tasks. However, one of their main limitations is the quadratic computational and memory cost of the standard attention mechanism. In this paper, we present a new family of Transformer models, which we call the Extended Transformer Construction (ETC), that allows for significant increases in input sequence length by introducing a new global-local attention mechanism between a global memory and the standard input tokens. We also show that combining global-local attention with relative position encodings allows ETC to handle structured data with ease. Empirical results on the Natural Questions data set show the promise of the approach.
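
As a rough illustration of the global-local attention idea described above, the following Python sketch builds an attention mask over a small global memory followed by the input tokens. It assumes that any pair involving a global token may attend, and that input tokens attend to each other only within a fixed window. The function names, the `radius` parameter, and the sizes are hypothetical, chosen for illustration rather than taken from the paper's implementation.

```python
def global_local_mask(n_global, n_long, radius):
    """Build a boolean attention mask over [global tokens | input tokens].

    mask[i][j] is True when position i may attend to position j.
    Assumption for this sketch: positions 0..n_global-1 are the global
    memory, and the remaining n_long positions are the input tokens.
    """
    total = n_global + n_long
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < n_global or j < n_global:
                # Any pair that involves a global token is allowed.
                mask[i][j] = True
            else:
                # Input-to-input attention is limited to a local window.
                mask[i][j] = abs(i - j) <= radius
    return mask


def attended_pairs(mask):
    """Count the (query, key) pairs that the mask allows."""
    return sum(sum(row) for row in mask)


if __name__ == "__main__":
    mask = global_local_mask(n_global=2, n_long=6, radius=1)
    total = len(mask)
    print("allowed pairs:", attended_pairs(mask), "of", total * total)
```

With the global memory size and window radius held fixed, the number of allowed pairs in this sketch grows linearly with the number of input tokens, whereas full attention grows quadratically.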


07/14/2022
Recurrent Memory Transformer
Transformer-based models show their effectiveness across multiple domain...

12/15/2021
LongT5: Efficient Text-To-Text Transformer for Long Sequences
Recent work has shown that either (1) increasing the input length or (2)...

07/28/2020
Big Bird: Transformers for Longer Sequences
Transformers-based models, such as BERT, have been one of the most succe...

12/30/2021
ChunkFormer: Learning Long Time Series with Multi-stage Chunked Transformer
The analysis of long sequence data remains challenging in many real-worl...

05/04/2023
AttentionViz: A Global View of Transformer Attention
Transformer models are revolutionizing machine learning, but their inner...

05/25/2023
Landmark Attention: Random-Access Infinite Context Length for Transformers
While transformers have shown remarkable success in natural language pro...

06/05/2020
GMAT: Global Memory Augmentation for Transformers
Transformer-based models have become ubiquitous in natural language proc...

Code Repositories

unanswerable_qa

The official implementation for ACL 2021 "Challenges in Information Seeking QA: Unanswerable Questions and Paragraph Retrieval".

