MA-Dreamer: Coordination and communication through shared imagination

by Kenzo Lobos-Tsunekawa, et al.

Multi-agent RL is difficult because the environment perceived by each individual agent is non-stationary. Theoretically sound methods using the REINFORCE estimator are impeded by its high variance, whereas value-function-based methods are affected by issues stemming from their ad-hoc handling of situations such as inter-agent communication. Methods like MADDPG are further constrained by requirements such as centralized critics. To address these issues, we present MA-Dreamer, a model-based method that uses both agent-centric and global differentiable models of the environment to train decentralized agents' policies and critics using model rollouts, a.k.a. 'imagination'. Since only the model training is done off-policy, inter-agent communication/coordination and 'language emergence' can be handled in a straightforward manner. We compare the performance of MA-Dreamer with other methods on two soccer-based games. Our experiments show that in long-term speaker-listener tasks and in cooperative games with strong partial observability, MA-Dreamer finds a solution that makes effective use of coordination, whereas competing methods obtain marginal scores and fail outright, respectively. By effectively achieving coordination and communication under more relaxed and general conditions, our method opens the door to the study of more complex problems and population-based training.




The Emergence of Adversarial Communication in Multi-Agent Reinforcement Learning

Many real-world problems require the coordination of multiple autonomous...

Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning

In cooperative multi-agent reinforcement learning, a team of agents work...

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

In many real-world problems, a team of agents need to collaborate to max...

Learning Generalizable Risk-Sensitive Policies to Coordinate in Decentralized Multi-Agent General-Sum Games

While various multi-agent reinforcement learning methods have been propo...

Inference-Based Deterministic Messaging For Multi-Agent Communication

Communication is essential for coordination among humans and animals. Th...

Learning Existing Social Conventions in Markov Games

In order for artificial agents to coordinate effectively with people, th...

Planning Not to Talk: Multiagent Systems that are Robust to Communication Loss

In a cooperative multiagent system, a collection of agents executes a jo...
