Reasoning with Language Model is Planning with World Model

05/24/2023
by Shibo Hao, et al.

Large language models (LLMs) have shown remarkable reasoning capabilities, especially when prompted to generate intermediate reasoning steps (e.g., Chain-of-Thought, CoT). However, LLMs can still struggle with problems that are easy for humans, such as generating action plans for executing tasks in a given environment, or performing complex math, logical, and commonsense reasoning. The deficiency stems from the key fact that LLMs lack an internal world model to predict the world state (e.g., environment status, intermediate variable values) and simulate long-term outcomes of actions. This prevents LLMs from performing deliberate planning akin to human brains, which involves exploring alternative reasoning paths, anticipating future states and rewards, and iteratively refining existing reasoning steps. To overcome these limitations, we propose a new LLM reasoning framework, Reasoning via Planning (RAP). RAP repurposes the LLM as both a world model and a reasoning agent, and incorporates a principled planning algorithm (based on Monte Carlo Tree Search) for strategic exploration in the vast reasoning space. During reasoning, the LLM (as agent) incrementally builds a reasoning tree under the guidance of the LLM (as world model) and task-specific rewards, and obtains a high-reward reasoning path efficiently with a proper balance between exploration and exploitation. We apply RAP to a variety of challenging reasoning problems including plan generation, math reasoning, and logical inference. Empirical results on these tasks demonstrate the superiority of RAP over various strong baselines, including CoT and least-to-most prompting with self-consistency. RAP on LLaMA-33B surpasses CoT on GPT-4 with 33% relative improvement in a plan generation setting.
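To make the abstract's description of Monte Carlo Tree Search over a reasoning tree concrete, here is a minimal Python sketch of that general idea. It is not the paper's implementation; the helpers propose_actions, predict_next_state, reward, and is_terminal are hypothetical stand-ins for prompting the LLM as agent, prompting the LLM as world model, the task-specific reward, and a termination check, respectively.

```python
import math
import random

# Minimal sketch of MCTS-style search over a reasoning tree, in the spirit of RAP.
# The callables passed in (propose_actions, predict_next_state, reward, is_terminal)
# are hypothetical placeholders for LLM prompting and task-specific scoring.

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state          # world state predicted by the LLM-as-world-model
        self.parent = parent
        self.action = action        # reasoning step that led to this state
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

def uct_score(child, parent_visits, c=1.4):
    """Upper Confidence bound for Trees: trades off exploitation vs. exploration."""
    if child.visits == 0:
        return float("inf")
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def mcts(root_state, propose_actions, predict_next_state, reward,
         is_terminal, n_iterations=100):
    root = Node(root_state)
    for _ in range(n_iterations):
        # 1. Selection: descend by UCT until reaching a leaf node.
        node = root
        while node.children:
            node = max(node.children, key=lambda ch: uct_score(ch, node.visits))
        # 2. Expansion: the agent proposes candidate reasoning steps; the world
        #    model predicts the state each step leads to.
        if not is_terminal(node.state):
            for action in propose_actions(node.state):
                next_state = predict_next_state(node.state, action)
                node.children.append(Node(next_state, parent=node, action=action))
            node = random.choice(node.children)
        # 3. Evaluation: score the reached state with the task-specific reward.
        value = reward(node.state)
        # 4. Backpropagation: update statistics along the path back to the root.
        while node is not None:
            node.visits += 1
            node.total_reward += value
            node = node.parent
    # Read out a reasoning path by following the most-visited children.
    path, node = [], root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
        path.append(node.action)
    return path
```

In this sketch the UCT term is what provides the "proper balance between exploration and exploitation" mentioned above: well-scored steps are revisited, while rarely tried steps keep a bonus that encourages exploring alternative reasoning paths.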

Related research

- REFINER: Reasoning Feedback on Intermediate Representations (04/04/2023)
- ART: Automatic multi-step reasoning and tool-use for large language models (03/16/2023)
- Faithful Question Answering with Monte-Carlo Planning (05/04/2023)
- The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning (05/24/2023)
- ReAct: Synergizing Reasoning and Acting in Language Models (10/06/2022)
- Tree of Uncertain Thoughts Reasoning for Large Language Models (09/14/2023)
- TaskLAMA: Probing the Complex Task Understanding of Language Models (08/29/2023)