Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning

10/01/2017
by   Xiangxiang Chu, et al.

Deep reinforcement learning for multi-agent cooperation and competition has recently become an active research topic. This paper focuses on cooperative multi-agent problems, using actor-critic methods in settings where each agent has only local observations. Multi-agent deep deterministic policy gradient has obtained state-of-the-art results on some multi-agent games, but it does not scale well as the number of agents grows. To improve scalability, we propose a parameter sharing deterministic policy gradient method with three neural-network-based variants: actor-critic sharing, actor sharing, and actor sharing with a partially shared critic. Benchmarks from rllab show that the proposed method has advantages in learning speed and memory efficiency, scales well with a growing number of agents, and, where applicable, can make full use of reward sharing and exchangeability.
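
To make the memory argument concrete, here is a minimal sketch of the actor sharing idea only, not the authors' implementation. It assumes homogeneous agents with identical observation and action spaces; the network sizes and the helper names `init_actor`, `actor_forward`, and `count_params` are hypothetical, and no critic or training loop is shown. A single set of actor weights maps every agent's local observation to its action, so the actor parameter count stays fixed as agents are added, whereas one actor per agent grows linearly.

```python
import numpy as np

def init_actor(obs_dim, act_dim, hidden=16, seed=0):
    """Create weights for a small deterministic actor (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (obs_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, act_dim)),
        "b2": np.zeros(act_dim),
    }

def actor_forward(params, obs):
    """Map local observations to bounded deterministic actions in [-1, 1]."""
    h = np.tanh(obs @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

def count_params(params):
    """Total number of scalar weights in one actor."""
    return sum(p.size for p in params.values())

# Assumed toy problem: 5 homogeneous agents, 8-dim observations, 2-dim actions.
n_agents, obs_dim, act_dim = 5, 8, 2

# Actor sharing: one weight set, applied to each agent's own local observation.
shared_actor = init_actor(obs_dim, act_dim)
local_obs = np.random.default_rng(1).normal(size=(n_agents, obs_dim))
actions = actor_forward(shared_actor, local_obs)  # one row of actions per agent

# Baseline for comparison: a separate actor per agent.
independent_actors = [init_actor(obs_dim, act_dim, seed=i) for i in range(n_agents)]

shared_count = count_params(shared_actor)
independent_count = sum(count_params(p) for p in independent_actors)
print(shared_count, independent_count)
```

Because the agents differ only in the observations they feed in, each agent still acts on its own local view even though the weights are common to all of them.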


research
03/15/2019

A Multi-Agent Off-Policy Actor-Critic Algorithm for Distributed Reinforcement Learning

This paper extends off-policy reinforcement learning to the multi-agent ...
research
06/12/2020

Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning

Exploration in multi-agent reinforcement learning is a challenging probl...
research
09/14/2020

Deep Actor-Critic Learning for Distributed Power Control in Wireless Mobile Networks

Deep reinforcement learning offers a model-free alternative to supervise...
research
02/08/2019

Hierarchical Critics Assignment for Multi-agent Reinforcement Learning

In this paper, we investigate the use of global information to speed up ...
research
09/02/2022

Semi-Centralised Multi-Agent Reinforcement Learning with Policy-Embedded Training

Centralised training (CT) is the basis for many popular multi-agent rein...
research
10/15/2020

Cooperative-Competitive Reinforcement Learning with History-Dependent Rewards

Consider a typical organization whose worker agents seek to collectively...
research
09/19/2018

Deterministic limit of temporal difference reinforcement learning for stochastic games

Reinforcement learning in multi-agent systems has been studied in the fi...
