Influence-based Reinforcement Learning for Intrinsically-motivated Agents

by Ammar Fayad, et al.

Reinforcement learning (RL) is an active research area with several important applications. However, certain challenges remain, among them finding policies that achieve sufficient exploration and coordination while solving a given task. In this work, we present an algorithmic framework of two RL agents, each with a different objective. We introduce a novel function approximation approach to assess the influence F of a given policy on others. When F is optimized as a regularizer of a policy π's objective, agents learn to coordinate team behavior while exploiting high-reward regions of the solution space. Additionally, both agents use prediction error as intrinsic motivation to learn policies that behave as differently as possible, thus achieving the exploration criterion. Our method was evaluated on the suite of OpenAI Gym tasks as well as on cooperative and mixed scenarios, where agent populations are able to discover various physical and informational coordination strategies, showing state-of-the-art performance when compared to well-known baselines.
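As a rough illustration of the two ingredients named in the abstract, the sketch below combines an extrinsic reward with a prediction-error bonus and an influence term. It is not the paper's formulation: the additive form, the mean-squared-error measure, the weights `eta` and `beta`, and the function names are all assumptions made here for illustration, and the influence value F is simply passed in as a number rather than estimated by a function approximator.

```python
def prediction_error(predicted, observed):
    """Mean squared error between a predicted and an observed vector (assumed measure)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def shaped_reward(extrinsic, predicted, observed, influence, eta=0.1, beta=0.5):
    """Hypothetical shaped reward: task reward + intrinsic bonus + influence regularizer.

    eta and beta are illustrative weights, not values from the paper.
    """
    intrinsic = prediction_error(predicted, observed)  # larger error -> larger bonus
    return extrinsic + eta * intrinsic + beta * influence


if __name__ == "__main__":
    r = shaped_reward(
        extrinsic=1.0,
        predicted=[1.0, 2.0],
        observed=[1.0, 4.0],
        influence=0.5,
    )
    print(r)
```

Under these assumptions, a larger prediction error raises the bonus and so rewards behavior the predictor has not yet captured, while the influence term biases the agent toward actions that affect the other agent.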




Influence-Based Multi-Agent Exploration

Intrinsically motivated reinforcement learning aims to address the explo...

Reliably Re-Acting to Partner's Actions with the Social Intrinsic Motivation of Transfer Empowerment

We consider multi-agent reinforcement learning (MARL) for cooperative co...

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

We propose a unified mechanism for achieving coordination and communicat...

Influencing Long-Term Behavior in Multiagent Reinforcement Learning

The main challenge of multiagent reinforcement learning is the difficult...

Reinforcement Learning with Intrinsic Affinity for Personalized Asset Management

The common purpose of applying reinforcement learning (RL) to asset mana...

Noisy Agents: Self-supervised Exploration by Predicting Auditory Events

Humans integrate multiple sensory modalities (e.g. visual and audio) to ...

Efficient Online Estimation of Empowerment for Reinforcement Learning

Training artificial agents to acquire desired skills through model-free ...
