RoCo: Dialectic Multi-Robot Collaboration with Large Language Models

by Zhao Mandi, et al.

We propose a novel approach to multi-robot collaboration that harnesses the power of pre-trained large language models (LLMs) for both high-level communication and low-level path planning. Robots are equipped with LLMs to discuss and collectively reason about task strategies. They then generate sub-task plans and task-space waypoint paths, which are used by a multi-arm motion planner to accelerate trajectory planning. We also provide feedback from the environment, such as collision checking, and prompt the LLM agents to improve their plan and waypoints in-context. For evaluation, we introduce RoCoBench, a 6-task benchmark covering a wide range of multi-robot collaboration scenarios, accompanied by a text-only dataset for agent representation and reasoning. We experimentally demonstrate the effectiveness of our approach: it achieves high success rates across all tasks in RoCoBench and adapts to variations in task semantics. Our dialog setup offers high interpretability and flexibility: in real-world experiments, we show that RoCo easily incorporates a human in the loop, where a user can communicate and collaborate with a robot agent to complete tasks together. See the project website for videos and code.
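The propose, check, and re-prompt cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM dialog is replaced by a hypothetical stand-in function (`toy_propose`), the environment feedback is reduced to a simple pairwise distance threshold (`check_collisions`), and all names, types, and the 0.1 m threshold are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional, Tuple

Waypoint = Tuple[float, float, float]
Plan = Dict[str, List[Waypoint]]  # hypothetical: robot name -> task-space waypoint path


def check_collisions(plan: Plan, min_dist: float = 0.1) -> List[str]:
    """Return one text message per time step where two robots are closer than min_dist.

    Assumption: a plain distance threshold stands in for real collision checking.
    """
    feedback: List[str] = []
    names = sorted(plan)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for step, (pa, pb) in enumerate(zip(plan[a], plan[b])):
                dist = sum((x - y) ** 2 for x, y in zip(pa, pb)) ** 0.5
                if dist < min_dist:
                    feedback.append(f"step {step}: {a} and {b} are {dist:.2f} m apart")
    return feedback


def plan_with_feedback(
    propose: Callable[[List[str]], Plan], max_rounds: int = 3
) -> Optional[Plan]:
    """Ask for a plan, check it, and feed failures back until it passes or rounds run out."""
    history: List[str] = []  # accumulated environment feedback, shown to the proposer
    for _ in range(max_rounds):
        plan = propose(history)
        problems = check_collisions(plan)
        if not problems:
            return plan  # a validated plan would then go to a motion planner
        history.extend(problems)
    return None


def toy_propose(history: List[str]) -> Plan:
    """Hypothetical stand-in for the LLM dialog: lifts bob's path once feedback exists."""
    z = 0.5 if history else 0.0
    return {
        "alice": [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)],
        "bob": [(1.0, 0.0, z), (0.5, 0.0, z)],
    }


result = plan_with_feedback(toy_propose)
```

In this toy run the first proposal puts both robots at the same point at step 1, so the checker returns a message, and the second proposal, made with that message in its history, passes. In a real system `propose` would wrap prompted LLM agents and the accepted waypoints would seed a multi-arm motion planner, as the abstract describes.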




SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models

In this work, we introduce SMART-LLM, an innovative framework designed f...

Statler: State-Maintaining Language Models for Embodied Reasoning

Large language models (LLMs) provide a promising tool that enables robots...

Prompt a Robot to Walk with Large Language Models

Large language models (LLMs) pre-trained on vast internet-scale data hav...

Conformal Temporal Logic Planning using Large Language Models: Knowing When to Do What and When to Ask for Help

This paper addresses a new motion planning problem for mobile robots tas...

AlphaBlock: Embodied Finetuning for Vision-Language Reasoning in Robot Manipulation

We propose a novel framework for learning high-level cognitive capabilit...

Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

In this paper, we study the problem of planning in Minecraft, a popular,...

Reshaping Robot Trajectories Using Natural Language Commands: A Study of Multi-Modal Data Alignment Using Transformers

Natural language is the most intuitive medium for us to interact with ot...
