GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints

05/22/2023
by Joshua Ainslie, et al.

Multi-query attention (MQA), which only uses a single key-value head, drastically speeds up decoder inference. However, MQA can lead to quality degradation, and moreover it may not be desirable to train a separate model just for faster inference. We (1) propose a recipe for uptraining existing multi-head language model checkpoints into models with MQA using 5% of original pre-training compute, and (2) introduce grouped-query attention (GQA), a generalization of multi-query attention which uses an intermediate (more than one, fewer than the number of query heads) number of key-value heads. We show that uptrained GQA achieves quality close to multi-head attention with comparable speed to MQA.
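To make the relationship between multi-head attention, MQA, and GQA concrete, here is a minimal pure-Python sketch of a single decoding step in which several query heads share one key-value head. It is an illustration under stated assumptions, not the authors' implementation: the function names (`kv_head_for_query_head`, `grouped_query_attention_step`, `kv_cache_elements`) are hypothetical, and assigning query heads to key-value heads in contiguous blocks is an assumed convention. Setting the number of key-value heads to 1 gives MQA; setting it equal to the number of query heads gives ordinary multi-head attention.

```python
import math


def kv_head_for_query_head(q_head, num_query_heads, num_kv_heads):
    """Index of the key-value head shared by query head `q_head`.

    Assumption: query heads are split into contiguous, equal-sized groups.
    """
    if num_query_heads % num_kv_heads != 0:
        raise ValueError("num_query_heads must be divisible by num_kv_heads")
    group_size = num_query_heads // num_kv_heads
    return q_head // group_size


def grouped_query_attention_step(queries, keys, values):
    """Attention for one decoding position.

    queries: [num_query_heads][head_dim]
    keys, values: [num_kv_heads][seq_len][head_dim]
    Returns: [num_query_heads][head_dim]
    """
    num_query_heads = len(queries)
    num_kv_heads = len(keys)
    outputs = []
    for h, q in enumerate(queries):
        g = kv_head_for_query_head(h, num_query_heads, num_kv_heads)
        scale = 1.0 / math.sqrt(len(q))
        # Scaled dot-product scores against the shared keys of group g.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys[g]]
        # Numerically stable softmax.
        max_score = max(scores)
        weights = [math.exp(s - max_score) for s in scores]
        total = sum(weights)
        # Weighted sum of the shared values of group g.
        out = [
            sum(w * v[j] for w, v in zip(weights, values[g])) / total
            for j in range(len(q))
        ]
        outputs.append(out)
    return outputs


def kv_cache_elements(seq_len, num_kv_heads, head_dim):
    """Cached key and value elements per layer (the factor 2 covers K and V)."""
    return 2 * seq_len * num_kv_heads * head_dim
```

In this sketch the per-layer key-value cache grows linearly with the number of key-value heads, so going from 8 key-value heads to 2 cuts `kv_cache_elements` by a factor of 4, while every query head still computes its own attention weights.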

research
05/11/2021

EL-Attention: Memory Efficient Lossless Attention for Generation

Transformer model with multi-head attention requires caching intermediat...
research
10/18/2021

Compositional Attention: Disentangling Search and Retrieval

Multi-head, key-value attention is the backbone of the widely successful...
research
06/29/2020

Multi-Head Attention: Collaborate Instead of Concatenate

Attention layers are widely used in natural language processing (NLP) an...
research
11/06/2019

Fast Transformer Decoding: One Write-Head is All You Need

Multi-head attention layers, as used in the Transformer neural sequence ...
research
04/13/2021

What's in your Head? Emergent Behaviour in Multi-Task Transformer Models

The primary paradigm for multi-task training in natural language process...
research
07/21/2023

What can a Single Attention Layer Learn? A Study Through the Random Features Lens

Attention layers – which map a sequence of inputs to a sequence of outpu...
research
02/14/2021

Query-by-Example Keyword Spotting system using Multi-head Attention and Softtriple Loss

This paper proposes a neural network architecture for tackling the query...
