Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

07/28/2021
by Pengfei Liu, et al.

This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning". Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x' that has some unfilled slots, and then the language model is used to probabilistically fill the unfilled information to obtain a final string x̂, from which the final output y can be derived. This framework is powerful and attractive for a number of reasons: it allows the language model to be pre-trained on massive amounts of raw text, and by defining a new prompting function the model is able to perform few-shot or even zero-shot learning, adapting to new scenarios with little or no labeled data. In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g., the choice of pre-trained models, prompts, and tuning strategies. To make the field more accessible to interested beginners, we not only provide a systematic review of existing work and a highly structured typology of prompt-based concepts, but also release other resources, e.g., a website (http://pretrain.nlpedia.ai/) including a constantly updated survey and paper list.
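To make the template → slot-filling → answer-mapping workflow concrete, here is a minimal sketch of prompt-based prediction for a sentiment task, using the HuggingFace Transformers fill-mask pipeline. The template, the model choice (bert-base-uncased), and the label words are illustrative assumptions, not prescriptions from the paper.

```python
from transformers import pipeline

# Minimal sketch, assuming a masked LM served via the HuggingFace
# "fill-mask" pipeline; template and label words below are hypothetical.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

x = "I love this movie."
# Template turns the input x into a prompt x' with one unfilled slot.
prompt = f"{x} Overall, it was a [MASK] movie."

# The LM probabilistically fills the slot; restrict to candidate answers.
candidates = fill_mask(prompt, targets=["great", "terrible"])

# Answer mapping: the filled word z is mapped back to the output y.
answer_to_label = {"great": "positive", "terrible": "negative"}
best = max(candidates, key=lambda c: c["score"])
y = answer_to_label[best["token_str"]]
print(y)  # expected: "positive"
```

Note how no sentiment-specific training is involved: the pre-trained language model alone scores the candidate slot fillers, which is what enables the few-shot and zero-shot behavior described above.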

Related research

03/09/2022 · HealthPrompt: A Zero-shot Learning Paradigm for Clinical Natural Language Processing
Deep learning algorithms are dependent on the availability of large-scal...

09/30/2022 · What Makes Pre-trained Language Models Better Zero/Few-shot Learners?
In this paper, we propose a theoretical framework to explain the efficac...

09/21/2022 · WeLM: A Well-Read Pre-trained Language Model for Chinese
Large Language Models pre-trained with self-supervised learning have dem...

09/29/2022 · Bidirectional Language Models Are Also Few-shot Learners
Large language models such as GPT-3 (Brown et al., 2020) can perform arb...

08/15/2023 · Through the Lens of Core Competency: Survey on Evaluation of Large Language Models
From pre-trained language model (PLM) to large language model (LLM), the...

03/13/2023 · A Survey of Graph Prompting Methods: Techniques, Applications, and Challenges
While deep learning has achieved great success on various tasks, the tas...

07/08/2021 · A Systematic Survey of Text Worlds as Embodied Natural Language Environments
Text Worlds are virtual environments for embodied agents that, unlike 2D...
