PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

by   Pengyi Li, et al.

Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal sub-optimal collaborative behaviors also emerge with strong correlations, and simply maximizing the MI can, surprisingly, hinder the learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is maximizing the MI associated with superior collaborative behaviors and minimizing the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaborations while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.


A Maximum Mutual Information Framework for Multi-Agent Reinforcement Learning

In this paper, we propose a maximum mutual information (MMI) framework f...

A Variational Approach to Mutual Information-Based Coordination for Multi-Agent Reinforcement Learning

In this paper, we propose a new mutual information framework for multi-a...

Maximum Joint Entropy and Information-Based Collaboration of Automated Learning Machines

We are working to develop automated intelligent agents, which can act an...

LINDA: Multi-Agent Local Information Decomposition for Awareness of Teammates

In cooperative multi-agent reinforcement learning (MARL), where agents o...

Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

In the case of the two-person zero-sum stochastic game with a central co...

Learning to Simulate Self-Driven Particles System with Coordinated Policy Optimization

Self-Driven Particles (SDP) describe a category of multi-agent systems c...

Effective and Stable Role-Based Multi-Agent Collaboration by Structural Information Principles

Role-based learning is a promising approach to improving the performance...

Please sign up or login with your details

Forgot password? Click here to reset