Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents

08/18/2023
by Arrasy Rahman, et al.

Robustly cooperating with unseen agents and human partners presents significant challenges due to the diverse cooperative conventions these partners may adopt. Existing Ad Hoc Teamwork (AHT) methods address this challenge by training an agent with a population of diverse teammate policies obtained through maximizing specific diversity metrics. However, these heuristic diversity metrics do not always maximize the agent's robustness in all cooperative problems. In this work, we first propose that maximizing an AHT agent's robustness requires it to emulate policies in the minimum coverage set (MCS), the set of best-response policies to any partner policies in the environment. We then introduce the L-BRDiv algorithm that generates a set of teammate policies that, when used for AHT training, encourage agents to emulate policies from the MCS. L-BRDiv works by solving a constrained optimization problem to jointly train teammate policies for AHT training and approximating AHT agent policies that are members of the MCS. We empirically demonstrate that L-BRDiv produces more robust AHT agents than state-of-the-art methods in a broader range of two-player cooperative problems without the need for extensive hyperparameter tuning for its objectives. Our study shows that L-BRDiv outperforms the baseline methods by prioritizing discovering distinct members of the MCS instead of repeatedly finding redundant policies.
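As a toy illustration of the minimum coverage set idea described above (not the L-BRDiv algorithm itself), the sketch below computes a smallest set of agent actions that contains a best response to every sampled partner policy. It assumes a one-shot, two-player cooperative matrix game, restricts the agent to deterministic actions, and samples partner policies on a grid. The payoff matrix `PAYOFF` and all function names are hypothetical and do not come from the paper.

```python
from itertools import combinations, product

# Hypothetical shared-reward matrix: PAYOFF[agent_action][partner_action].
PAYOFF = [
    [10, 0, 0],  # agent action 0 coordinates with partner action 0
    [0, 8, 0],   # agent action 1 coordinates with partner action 1
    [4, 4, 4],   # agent action 2 is a safe, mediocre choice
    [3, 3, 3],   # agent action 3 is always worse than action 2
]


def best_responses(partner_policy, tol=1e-9):
    """Agent actions whose expected return is maximal against partner_policy."""
    returns = [
        sum(prob * reward for prob, reward in zip(partner_policy, row))
        for row in PAYOFF
    ]
    best = max(returns)
    return {action for action, value in enumerate(returns) if value >= best - tol}


def partner_policy_grid(n_actions, steps):
    """Yield partner action distributions on a regular grid over the simplex."""
    for counts in product(range(steps + 1), repeat=n_actions):
        if sum(counts) == steps:
            yield [count / steps for count in counts]


def minimum_coverage_set(steps=20):
    """Smallest action set holding a best response to every sampled policy.

    Brute force over subsets, so only suitable for tiny games like this one.
    """
    br_sets = [
        best_responses(policy)
        for policy in partner_policy_grid(len(PAYOFF[0]), steps)
    ]
    for size in range(1, len(PAYOFF) + 1):
        for candidate in combinations(range(len(PAYOFF)), size):
            chosen = set(candidate)
            if all(br & chosen for br in br_sets):
                return chosen
    return set(range(len(PAYOFF)))


if __name__ == "__main__":
    print(sorted(minimum_coverage_set()))
```

In this toy game, action 3 is never a best response, so a robust agent gains nothing from being trained against partners that induce it. This mirrors the abstract's point about prioritizing distinct members of the MCS over redundant policies.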

research
07/28/2022

Towards Robust Ad Hoc Teamwork Agents By Creating Diverse Training Teammates

Ad hoc teamwork (AHT) is the problem of creating an agent that must coll...
research
06/26/2022

Generalized Beliefs for Cooperative AI

Self-play is a common paradigm for constructing solutions in Markov game...
research
04/28/2020

Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi

Hanabi is a cooperative game that brings the problem of modeling other p...
research
11/05/2021

Learning to Cooperate with Unseen Agent via Meta-Reinforcement Learning

Ad hoc teamwork problem describes situations where an agent has to coope...
research
03/07/2021

Adaptive Agent Architecture for Real-time Human-Agent Teaming

Teamwork is a set of interrelated reasoning, actions and behaviors of te...
research
01/06/2023

Centralized Cooperative Exploration Policy for Continuous Control Tasks

The deep reinforcement learning (DRL) algorithm works brilliantly on sol...
