Soft Hierarchical Graph Recurrent Networks for Many-Agent Partially Observable Environments

by Zhenhui Ye, et al.
Zhejiang University

Recent progress in multi-agent deep reinforcement learning (MADRL) has made it more practical for real-world tasks, but its relatively poor scalability and the constraints of partial observability still challenge its performance and deployment. Motivated by the intuitive observation that human society can be regarded as a large-scale partially observable environment, in which each individual communicates with neighbors and remembers its own experience, we propose a novel network structure called the hierarchical graph recurrent network (HGRN) for multi-agent cooperation under partial observability. Specifically, we model the multi-agent system as a graph, use a hierarchical graph attention network (HGAT) for communication between neighboring agents, and use a gated recurrent unit (GRU) to let agents record historical information. To encourage exploration and improve robustness, we design a maximum-entropy learning method that learns stochastic policies with a configurable target action entropy. Building on these techniques, we propose a value-based MADRL algorithm called Soft-HGRN and its actor-critic variant, SAC-HGRN. Experimental results on three homogeneous tasks and one heterogeneous environment not only show that our approach achieves clear improvements over four baselines, but also demonstrate the interpretability, scalability, and transferability of the proposed model. Ablation studies confirm the function and necessity of each component.
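The idea of a stochastic policy with a configurable target action entropy can be illustrated with a small sketch. The abstract does not give the paper's actual update rule, so the code below is an assumption: it uses a Boltzmann policy over Q-values and a generic SAC-style temperature adjustment, and the function names (soft_policy, update_alpha, tune_alpha), learning rate, and step count are hypothetical.

```python
import math

def soft_policy(q_values, alpha):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / alpha) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def update_alpha(alpha, probs, target_entropy, lr=0.05):
    """Assumed rule: raise the temperature alpha when the policy entropy is
    below the target and lower it when above (update done in log space so
    alpha stays positive)."""
    log_alpha = math.log(alpha) + lr * (target_entropy - entropy(probs))
    return math.exp(log_alpha)

def tune_alpha(q_values, target_entropy, alpha=1.0, steps=2000, lr=0.05):
    """Repeat the update on fixed Q-values until alpha settles."""
    for _ in range(steps):
        probs = soft_policy(q_values, alpha)
        alpha = update_alpha(alpha, probs, target_entropy, lr)
    return alpha

if __name__ == "__main__":
    q = [1.0, 0.5, 0.0, -1.0]  # made-up Q-values for four actions
    alpha = tune_alpha(q, target_entropy=1.0)
    print(alpha, entropy(soft_policy(q, alpha)))
```

The target must lie strictly between 0 and ln(n) for n actions (about 1.386 nats for four actions); otherwise no temperature can reach it. In a learning loop the Q-values would change over time, so the temperature would be updated alongside them instead of being tuned to convergence.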




SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially Observable Multi-Agent Path Finding

Multi-Agent Path Finding (MAPF) is a crucial component for many large-sc...

Entropy Enhanced Multi-Agent Coordination Based on Hierarchical Graph Learning for Continuous Action Space

In most existing studies on large-scale multi-agent coordination, the co...

More Like Real World Game Challenge for Partially Observable Multi-Agent Cooperation

Some standardized environments have been designed for partially observab...

Multi-Agent Actor-Critic with Hierarchical Graph Attention Network

Most previous studies on multi-agent reinforcement learning focus on der...

R-MADDPG for Partially Observable Environments and Limited Communication

There are several real-world tasks that would benefit from applying mul...

A Decentralized Communication Framework based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning

We propose a model enabling decentralized multiple agents to share their...

IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL

We introduce IMP-MARL, an open-source suite of multi-agent reinforcement...
