Learning Independently-Obtainable Reward Functions

by Christopher Grimm, et al.

We present a novel method for learning a set of disentangled reward functions that sum to the original environment reward and are constrained to be independently achievable. We define independent achievability in terms of value functions: the value of achieving one learned reward while pursuing another. Empirically, we show that our method learns meaningful reward decompositions in a variety of domains and that these decompositions generalize to some degree when the environment's reward is modified. Theoretically, we characterize the effect of maximizing our method's objective on the resulting reward functions and their corresponding optimal policies.
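The paper itself does not include code, but the sum-to-original constraint in the abstract can be illustrated with a minimal sketch. The snippet below is an assumption-laden toy: it splits each scalar reward into components via random softmax weights, standing in for the learned decomposition network, and does not implement the independent-achievability objective.

```python
import numpy as np

def decompose_rewards(rewards, n_components=3, seed=0):
    """Split each scalar reward into n_components parts that sum back to
    the original reward. The softmax weights here are random placeholders
    for the learned decomposition network described in the paper."""
    rng = np.random.default_rng(seed)
    logits = rng.normal(size=(len(rewards), n_components))
    # Softmax over components: rows of `weights` sum to 1.
    weights = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    # Each row of `components` therefore sums to the original reward.
    components = weights * np.asarray(rewards)[:, None]
    return components

rewards = [1.0, 0.5, -2.0]
parts = decompose_rewards(rewards)
assert np.allclose(parts.sum(axis=1), rewards)
```

The convex (softmax) weighting is one simple way to enforce the sum constraint by construction; the paper's actual method additionally shapes the components through the independent-achievability objective.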




Related Papers

Quantifying Differences in Reward Functions

For many tasks, the reward function is too complex to be specified proce...

Outcome-Driven Reinforcement Learning via Variational Inference

While reinforcement learning algorithms provide automated acquisition of...

Understanding Learned Reward Functions

In many real-world tasks, it is not possible to procedurally specify an ...

Shaping Proto-Value Functions via Rewards

In this paper, we combine task-dependent reward shaping and task-indepen...

Combining Reward Information from Multiple Sources

Given two sources of evidence about a latent variable, one can combine t...

Inverse Constraint Learning and Generalization by Transferable Reward Decomposition

We present the problem of inverse constraint learning (ICL), which recov...

Assisted Robust Reward Design

Real-world robotic tasks require complex reward functions. When we defin...
