Out of One, Many: Using Language Models to Simulate Human Samples

09/14/2022
by Lisa P. Argyle, et al.

We propose and explore the possibility that language models can be studied as effective proxies for specific human sub-populations in social science research. Practical and research applications of artificial intelligence tools have sometimes been limited by problematic biases (such as racism or sexism), which are often treated as uniform properties of the models. We show that the "algorithmic bias" within one such tool – the GPT-3 language model – is instead both fine-grained and demographically correlated, meaning that proper conditioning will cause it to accurately emulate response distributions from a wide variety of human subgroups. We term this property "algorithmic fidelity" and explore its extent in GPT-3. We create "silicon samples" by conditioning the model on thousands of socio-demographic backstories from real human participants in multiple large surveys conducted in the United States. We then compare the silicon and human samples to demonstrate that the information contained in GPT-3 goes far beyond surface similarity. It is nuanced, multifaceted, and reflects the complex interplay between ideas, attitudes, and socio-cultural context that characterizes human attitudes. We suggest that language models with sufficient algorithmic fidelity thus constitute a novel and powerful tool to advance understanding of humans and society across a variety of disciplines.
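The workflow described above (condition the model on a backstory, collect responses, compare the silicon and human response distributions) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the backstory wording, the field names (`race`, `age`, `ideology`), the `question_stem` argument, and the use of total variation distance as the comparison metric are all assumptions made for this example, and the model call itself is omitted.

```python
from collections import Counter


def build_backstory(respondent, question_stem):
    """Render one respondent's demographics as a first-person prompt.

    The field names and sentence templates are hypothetical; the abstract
    does not specify the actual prompt format.
    """
    parts = [
        f"Racially, I am {respondent['race']}.",
        f"I am {respondent['age']} years old.",
        f"Ideologically, I describe myself as {respondent['ideology']}.",
    ]
    return " ".join(parts) + " " + question_stem


def response_distribution(answers):
    """Turn a list of categorical answers into relative frequencies."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions.

    0.0 means identical distributions, 1.0 means disjoint support.
    This metric is an assumption for illustration only.
    """
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical respondent and question, for illustration only.
prompt = build_backstory(
    {"race": "white", "age": 42, "ideology": "moderate"},
    "When asked which option I prefer, I answer:",
)

# Toy answers standing in for human survey responses and model completions.
human = response_distribution(["A", "A", "B", "B"])
silicon = response_distribution(["A", "B", "B", "B"])
distance = total_variation(human, silicon)
```

In a real study, the `silicon` answers would come from sampling model completions for each backstory prompt, and the comparison would be run separately within each demographic subgroup rather than on the pooled sample.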


research
09/06/2023

Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity

Today, using Large-scale generative Language Models (LLMs) it is possibl...
research
06/03/2023

Towards Coding Social Science Datasets with Language Models

Researchers often rely on humans to code (label, annotate, etc.) large s...
research
10/06/2022

Prompt Compression and Contrastive Conditioning for Controllability and Toxicity Reduction in Language Models

We explore the idea of compressing the prompts used to condition languag...
research
07/10/2023

Demonstrations of the Potential of AI-based Political Issue Polling

Political polling is a multi-billion dollar industry with outsized influ...
research
04/19/2023

Supporting Human-AI Collaboration in Auditing LLMs with LLMs

Large language models are becoming increasingly pervasive and ubiquitous...
research
05/11/2023

Large Language Models Can Be Used To Effectively Scale Spear Phishing Campaigns

Recent progress in artificial intelligence (AI), particularly in the dom...
research
09/28/2022

Who is GPT-3? An Exploration of Personality, Values and Demographics

Language models such as GPT-3 have caused a furore in the research commu...
