In-Context Learning in Large Language Models Learns Label Relationships but Is Not Conventional Learning

07/23/2023
by Jannik Kossen, et al.

The performance of Large Language Models (LLMs) on downstream tasks often improves significantly when including examples of the input-label relationship in the context. However, there is currently no consensus about how this in-context learning (ICL) ability of LLMs works: for example, while Xie et al. (2021) liken ICL to a general-purpose learning algorithm, Min et al. (2022b) argue ICL does not even learn label relationships from in-context examples. In this paper, we study (1) how labels of in-context examples affect predictions, (2) how label relationships learned during pre-training interact with input-label examples provided in-context, and (3) how ICL aggregates label information across in-context examples. Our findings suggest LLMs usually incorporate information from in-context labels, but that pre-training and in-context label relationships are treated differently, and that the model does not consider all in-context information equally. Our results give insights into understanding and aligning LLM behavior.
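
The abstract's first question, how the labels of in-context examples affect predictions, is typically probed by comparing a model's output under correct versus perturbed in-context labels. Below is a minimal sketch of such a probe, not the authors' code: the model name (gpt2), the tiny sentiment demonstrations, and the prompt template are illustrative assumptions.

```python
# Minimal label-sensitivity probe: compare predictions for the same query
# when the in-context demonstrations carry correct vs. flipped labels.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder; the paper studies much larger LLMs
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

demos = [
    ("the movie was wonderful", "positive"),
    ("a dull, lifeless film", "negative"),
    ("an absolute delight", "positive"),
    ("i wanted my money back", "negative"),
]
query = "a charming and funny story"
labels = ["positive", "negative"]

def build_prompt(examples, query):
    """Format input-label demonstrations followed by the unlabeled query."""
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

@torch.no_grad()
def label_logprobs(prompt):
    """Log-probability of each candidate label right after the prompt."""
    ids = tok(prompt, return_tensors="pt").input_ids
    logits = model(ids).logits[0, -1]            # next-token distribution
    logprobs = torch.log_softmax(logits, dim=-1)
    # Score the first token of each label (with a leading space).
    return {y: logprobs[tok.encode(" " + y)[0]].item() for y in labels}

flipped = [(x, "negative" if y == "positive" else "positive") for x, y in demos]

print("correct labels:", label_logprobs(build_prompt(demos, query)))
print("flipped labels:", label_logprobs(build_prompt(flipped, query)))
# If the prediction shifts toward the flipped labels, the model is using
# in-context label information rather than relying only on pre-training.
```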


Related research

Investigating the Learning Behaviour of In-context Learning: A Comparison with Supervised Learning (07/28/2023)
Large language models (LLMs) have shown remarkable capacity for in-conte...

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks (05/26/2023)
We study the phenomenon of in-context learning (ICL) exhibited by large ...

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning (05/23/2023)
In-context learning (ICL) emerges as a promising capability of large lan...

Larger language models do in-context learning differently (03/07/2023)
We study how in-context learning (ICL) in language models is affected by...

Training Trajectories of Language Models Across Scales (12/19/2022)
Scaling up language models has led to unprecedented performance gains, b...

What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning (05/16/2023)
Large language models (LLMs) exploit in-context learning (ICL) to solve ...

Fairness-guided Few-shot Prompting for Large Language Models (03/23/2023)
Large language models have demonstrated surprising ability to perform in...
