Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

07/20/2023
by Neel Guha, et al.

Recent work has shown that language models' (LMs) prompt-based learning capabilities make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly: practitioners often require significant labeled data to evaluate the impact of prompt modifications. Our work asks whether it is possible to improve prompt-based learning without additional labeled data. We approach this problem by attempting to modify the predictions of a prompt, rather than the prompt itself. Our intuition is that accurate predictions should also be consistent: samples that are similar under some feature representation should receive the same prompt prediction. We propose Embroid, a method that computes multiple representations of a dataset under different embedding functions and uses the consistency between the LM predictions for neighboring samples to identify mispredictions. Embroid then uses these neighborhoods to create additional predictions for each sample, and combines these predictions with a simple latent variable graphical model to generate a final corrected prediction. In addition to providing a theoretical analysis of Embroid, we conduct a rigorous empirical evaluation across six different LMs and up to 95 different tasks. We find that (1) Embroid substantially improves performance over the original prompts (e.g., by an average of 7.3 points on GPT-JT), (2) it also realizes improvements for more sophisticated prompting strategies (e.g., chain-of-thought), and (3) it can be specialized to domains like law through the choice of embedding functions.
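The smoothing step described above can be sketched in a few lines. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes binary predictions in {-1, +1}, uses scikit-learn's NearestNeighbors to find each sample's neighbors in each embedding space, forms one auxiliary "neighborhood vote" per embedding function, and, for simplicity, replaces the paper's latent variable graphical model with an unweighted majority vote. The function name embroid_smooth and the parameter k are hypothetical.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def embroid_smooth(preds, embeddings_list, k=10):
    """Sketch of Embroid-style prediction smoothing.

    preds:           (n,) numpy array of binary LM predictions in {-1, +1}
    embeddings_list: list of (n, d_i) arrays, one per embedding function
    k:               number of neighbors to consult in each embedding space

    Returns a corrected (n,) array of predictions.
    """
    votes = [preds]  # the original LM predictions count as one voter
    for emb in embeddings_list:
        nn = NearestNeighbors(n_neighbors=k + 1).fit(emb)
        _, idx = nn.kneighbors(emb)         # idx[:, 0] is the point itself
        neighbor_preds = preds[idx[:, 1:]]  # (n, k) predictions of neighbors
        # Auxiliary vote: the majority prediction among each sample's
        # neighbors in this embedding space (0 means the vote abstains).
        votes.append(np.sign(neighbor_preds.sum(axis=1)))
    # The paper combines these votes with a latent variable graphical model;
    # a plain majority vote across voters is used here as a stand-in.
    combined = np.sign(np.sum(votes, axis=0))
    # Break ties by falling back to the original prediction.
    return np.where(combined == 0, preds, combined)
```

In the paper, the combination step estimates how reliable each voter is (from agreement statistics, without labels) and weights the votes accordingly; the majority vote above simply treats all voters equally.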


