BatchPrompt: Accomplish more with less

09/01/2023
by Jianzhe Lin, et al.

As the ever-increasing token limits of large language models (LLMs) have enabled long contexts as input, prompting with a single data sample per call may no longer be an efficient way to use them. A straightforward strategy for improving efficiency is to batch data within the token limit (e.g., 8k for gpt-3.5-turbo; 32k for GPT-4), which we call BatchPrompt. We have two initial observations for prompting with batched data. First, prompting with batched data in longer contexts inevitably leads to worse performance than single-data prompting. Second, the performance of the language model is significantly correlated with the positions and order of the batched data, due to the corresponding change in decoder context. To retain efficiency and overcome the performance loss, we propose Batch Permutation and Ensembling (BPE) and a novel Self-reflection-guided EArly Stopping (SEAS) technique. Our comprehensive experimental evaluation demonstrates that BPE boosts the performance of BatchPrompt by a striking margin on a range of popular NLP tasks, including question answering (Boolq), textual entailment (RTE), and duplicate question identification (QQP). These results are competitive with, or higher than, single-data prompting (SinglePrompt), while BatchPrompt requires far fewer LLM calls and input tokens (for SinglePrompt vs. BatchPrompt with batch size 32, only a small fraction of the LLM calls and input tokens is needed). To the best of our knowledge, this is the first work to technically improve the prompting efficiency of large language models. We hope our simple yet effective approach will shed light on future research on large language models. The code will be released.
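To make the idea concrete, below is a minimal Python sketch of BatchPrompt with BPE-style permutation voting and a simplified early stop. It is not the paper's implementation: `call_llm`, the prompt wording, the answer parsing, and the `rounds`/`agree_threshold` values are all hypothetical placeholders, and the early-stopping rule here is a plain vote-agreement check standing in for the paper's self-reflection signal (SEAS).

```python
# Sketch of BatchPrompt + Batch Permutation and Ensembling (BPE).
# `call_llm` is a placeholder for any chat-completion API; prompt format,
# parsing, and thresholds are illustrative assumptions, not the paper's.
import random
from collections import Counter


def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its raw text reply."""
    raise NotImplementedError


def batch_prompt(questions, task_instruction):
    """One LLM call answers a whole batch: questions are numbered in the
    prompt and the model is asked to reply line by line in the same order."""
    numbered = "\n".join(f"Q{i + 1}: {q}" for i, q in enumerate(questions))
    prompt = (
        f"{task_instruction}\n{numbered}\n"
        "Answer each question on its own line as 'A<i>: yes' or 'A<i>: no'."
    )
    reply = call_llm(prompt)
    answers = {}
    for line in reply.splitlines():
        line = line.strip()
        if line.lower().startswith("a") and ":" in line:
            idx_part, ans = line.split(":", 1)
            try:
                idx = int(idx_part[1:]) - 1  # "A3" -> batch position 2
            except ValueError:
                continue
            answers[idx] = ans.strip().lower()
    return [answers.get(i, "") for i in range(len(questions))]


def bpe(questions, task_instruction, rounds=5, agree_threshold=4):
    """Re-shuffle the batch each round so every sample sees different
    positions and neighbours, then majority-vote over the rounds.
    A sample whose votes already agree `agree_threshold` times is dropped
    from later rounds, so the batches shrink as answers settle."""
    votes = [Counter() for _ in questions]
    active = set(range(len(questions)))
    for _ in range(rounds):
        if not active:
            break
        order = list(active)
        random.shuffle(order)  # permute positions within the batch
        answers = batch_prompt([questions[i] for i in order], task_instruction)
        for pos, i in enumerate(order):
            if answers[pos]:
                votes[i][answers[pos]] += 1
            # simplified early stop: a confident majority ends this sample
            if votes[i] and votes[i].most_common(1)[0][1] >= agree_threshold:
                active.discard(i)
    return [v.most_common(1)[0][0] if v else "" for v in votes]
```

The permutation step matters because, per the second observation above, a sample's position within the batch changes the decoder context and can flip the answer; averaging over several shuffled orders washes out that position bias, while dropping already-settled samples keeps the extra rounds cheap.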


Related research

07/06/2023  Lost in the Middle: How Language Models Use Long Contexts
05/08/2023  A Frustratingly Easy Improvement for Position Embeddings via Random Padding
12/18/2022  Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
12/30/2022  Black-box language model explanation by context length probing
01/19/2023  Batch Prompting: Efficient Inference with Large Language Model APIs
08/23/2023  D4: Improving LLM Pretraining via Document De-Duplication and Diversification
09/09/2023  Neurons in Large Language Models: Dead, N-gram, Positional
