ReadOnce Transformers: Reusable Representations of Text for Transformers

10/24/2020
by Shih-Ting Lin, et al.

While large-scale language models are extremely effective when directly fine-tuned on many end-tasks, such models learn to extract information and solve the task simultaneously from end-task supervision. This is wasteful, as the general problem of gathering information from a document is mostly task-independent and need not be re-learned from scratch each time. Moreover, once the information has been captured in a computable representation, it can be re-used across examples, leading to faster training and evaluation of models. We present a transformer-based approach, ReadOnce Transformers, that is trained to build such information-capturing representations of text. Our model compresses a document into a variable-length, task-independent representation that can then be re-used across different examples and tasks, so that each document only needs to be read once. We also extend standard text-to-text models to consume our ReadOnce Representations alongside text to solve multiple downstream tasks. We show that these task-independent representations can be used for multi-hop QA, abstractive QA, and summarization. We observe 2x-5x speedups compared to standard text-to-text models, while also being able to handle long documents that would normally exceed the length limit of current models.
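The core idea, compressing a document once into a reusable representation and then combining that cached representation with many different task inputs, can be illustrated with a small, self-contained sketch. The module names below (DocumentCompressor, TaskHead) and the average-pooling compression are illustrative assumptions only, not the authors' actual architecture, which builds on pretrained transformer encoders.

```python
# Hypothetical sketch of the "read once, reuse" pattern described in the
# abstract. All names and layer choices here are assumptions for
# illustration, not the paper's implementation.

import torch
import torch.nn as nn


class DocumentCompressor(nn.Module):
    """Maps a document's token embeddings to a shorter, task-independent
    sequence of vectors (a stand-in for a ReadOnce-style representation)."""

    def __init__(self, hidden=64, compressed_len=8):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(compressed_len)  # crude length reduction
        self.proj = nn.Linear(hidden, hidden)

    def forward(self, doc_embeddings):           # (seq_len, hidden)
        x = doc_embeddings.t().unsqueeze(0)      # (1, hidden, seq_len)
        x = self.pool(x).squeeze(0).t()          # (compressed_len, hidden)
        return self.proj(x)                      # task-independent vectors


class TaskHead(nn.Module):
    """Consumes the cached document representation together with a
    task-specific input (e.g., an embedded question) to produce a score."""

    def __init__(self, hidden=64):
        super().__init__()
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, doc_repr, query_embedding):  # (k, hidden), (hidden,)
        pooled_doc = doc_repr.mean(dim=0)
        return self.score(torch.cat([pooled_doc, query_embedding]))


compressor, head = DocumentCompressor(), TaskHead()
document = torch.randn(500, 64)        # stand-in for embedded document tokens

# Read the document ONCE and cache its compressed representation.
with torch.no_grad():
    cached_repr = compressor(document)

# Reuse the same cached representation for many downstream examples/tasks.
for _ in range(3):
    question = torch.randn(64)         # stand-in for an embedded question
    print(head(cached_repr, question).item())
```

The reported speedups come from exactly this reuse pattern: the expensive document-reading step runs once per document, while only the lighter task-specific computation runs per example.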


