Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks

05/11/2022
by Katherine M. Collins, et al.

Human language offers a powerful window into our thoughts – we tell stories, give explanations, and express our beliefs and goals through words. Abundant evidence also suggests that language plays a developmental role in structuring our learning. Here, we ask: how much of human-like thinking can be captured by learning statistical patterns in language alone? We first contribute a new challenge benchmark for comparing humans and distributional large language models (LLMs). Our benchmark contains two problem-solving domains (planning and explanation generation) and is designed to require generalization to new, out-of-distribution problems expressed in language. We find that humans are far more robust than LLMs on this benchmark. Next, we propose a hybrid Parse-and-Solve model, which augments distributional LLMs with a structured symbolic reasoning module. We find that this model shows more robust adaptation to out-of-distribution planning problems, demonstrating the promise of hybrid AI models for more human-like reasoning.
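
The division of labor in a parse-then-solve pipeline can be illustrated with a toy sketch. This is not the authors' implementation: the abstract says only that a distributional LLM is augmented with a structured symbolic reasoning module. Everything below is an assumption made for illustration, including the function names parse_goal, successors, and solve, the regular expression standing in for the language-model parsing step, and the breadth-first search standing in for the symbolic solver.

```python
import re
from collections import deque


def parse_goal(text):
    """Hypothetical stand-in for the LLM parsing step.

    A real system would prompt a language model to translate the
    description into a formal goal; a regex over "<object> on <place>"
    phrases is used here purely as a placeholder.
    """
    return frozenset(("on", obj, loc)
                     for obj, loc in re.findall(r"(\w+) on (\w+)", text))


def successors(state, locations):
    """Yield (action, next_state) pairs: move one object to another location."""
    for fact in state:
        _, obj, loc = fact
        for dest in locations:
            if dest != loc:
                next_state = (state - {fact}) | {("on", obj, dest)}
                yield ("move", obj, loc, dest), next_state


def solve(initial, goal, locations):
    """Breadth-first search for a shortest plan; returns None if none exists."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return plan
        for action, next_state in successors(state, locations):
            if next_state not in seen:
                seen.add(next_state)
                frontier.append((next_state, plan + [action]))
    return None


# Assumed toy world: two objects, three locations.
initial = frozenset({("on", "ball", "floor"), ("on", "cup", "shelf")})
goal = parse_goal("put the ball on table and the cup on floor")
plan = solve(initial, goal, ["floor", "shelf", "table"])
```

The point of the split is that the search step depends only on the parsed goal and the world model, not on how the problem was worded, so an unusual goal is handled the same way as a familiar one once it has been parsed correctly.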

research
07/14/2022

Language models show human-like content effects on reasoning

Abstract reasoning is a key ability for an intelligent system. Large lan...
research
03/30/2023

Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure

Increase in computational scale and fine-tuning has seen a dramatic impr...
research
06/01/2023

TopEx: Topic-based Explanations for Model Comparison

Meaningfully comparing language models is challenging with current expla...
research
02/10/2023

Translating Natural Language to Planning Goals with Large-Language Models

Recent large language models (LLMs) have demonstrated remarkable perform...
research
06/13/2023

Synapse: Leveraging Few-Shot Exemplars for Human-Level Computer Control

This paper investigates the design of few-shot exemplars for computer au...
research
05/22/2023

Beneath Surface Similarity: Large Language Models Make Reasonable Scientific Analogies after Structure Abduction

Analogical reasoning is essential for human cognition, allowing us to co...
research
02/02/2023

QR-CLIP: Introducing Explicit Open-World Knowledge for Location and Time Reasoning

Daily images may convey abstract meanings that require us to memorize an...