Influence-based Reinforcement Learning for Intrinsically-motivated Agents

08/28/2021
by   Ammar Fayad, et al.
Reinforcement learning (RL) is a very active research area with several important applications. However, a number of challenges remain open, among them the ability to find policies that achieve sufficient exploration and coordination while solving a given task. In this work, we present an algorithmic framework of two RL agents, each with a different objective. We introduce a novel function-approximation approach to assess the influence F of a given policy on the others. By optimizing F as a regularizer of the policy π's objective, agents learn to coordinate team behavior while exploiting high-reward regions of the solution space. In addition, both agents use prediction error as intrinsic motivation to learn policies that behave as differently as possible, thereby satisfying the exploration criterion. We evaluated our method on the suite of OpenAI Gym tasks as well as on cooperative and mixed scenarios, where agent populations discover a variety of physical and informational coordination strategies, achieving state-of-the-art performance compared to well-known baselines.
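The abstract describes shaping each agent's objective with an influence term F and a prediction-error intrinsic bonus. The sketch below is a minimal, hypothetical illustration of that structure, not the paper's implementation: the KL-based proxy for F, the RND-style random-network bonus, and the weights `alpha`/`beta` are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def influence_F(policy_probs, other_probs):
    # Assumed proxy for the "influence of a policy on others": KL divergence
    # between the other agent's action distribution and this agent's.
    # (The paper learns F with function approximation; this is only a stand-in.)
    eps = 1e-8
    return float(np.sum(other_probs * np.log((other_probs + eps) / (policy_probs + eps))))

def intrinsic_bonus(predictor, target, obs):
    # Prediction-error intrinsic motivation: the squared error of a learned
    # predictor against a fixed, randomly initialized target network.
    err = predictor(obs) - target(obs)
    return float(np.mean(err ** 2))

# Toy fixed "networks": random linear maps standing in for neural networks.
W_target = rng.normal(size=(4, 8))
W_pred = rng.normal(size=(4, 8))
target = lambda obs: obs @ W_target
predictor = lambda obs: obs @ W_pred

obs = rng.normal(size=(4,))      # one observation
extrinsic = 1.0                  # task reward from the environment
alpha, beta = 0.1, 0.01          # assumed regularizer weights

p = np.array([0.7, 0.2, 0.1])    # this agent's action distribution
q = np.array([0.5, 0.3, 0.2])    # the other agent's action distribution

# F regularizes the objective; the prediction-error bonus drives exploration.
shaped_reward = (extrinsic
                 + alpha * influence_F(p, q)
                 + beta * intrinsic_bonus(predictor, target, obs))
```

Since the KL term and the squared prediction error are both nonnegative, the shaped reward here never falls below the extrinsic reward; in the actual method, the balance between the two terms would be tuned per task.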
