Scalable agent alignment via reward modeling: a research direction

11/19/2018
by Jan Leike, et al.

One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions. Designing such reward functions is difficult in part because the user only has an implicit understanding of the task objective. This gives rise to the agent alignment problem: how do we create agents that behave in accordance with the user's intentions? We outline a high-level research direction to solve the agent alignment problem centered around reward modeling: learning a reward function from interaction with the user and optimizing the learned reward function with reinforcement learning. We discuss the key challenges we expect to face when scaling reward modeling to complex and general domains, concrete approaches to mitigate these challenges, and ways to establish trust in the resulting agents.
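The two steps named in the abstract (learn a reward function from interaction with the user, then optimize it with reinforcement learning) can be illustrated with a minimal toy sketch. It rests on several assumptions that are not from the abstract: the user's feedback takes the form of noisy pairwise comparisons, the reward model is a Bradley-Terry (logistic) model with one value per action, and the "environment" is a four-armed bandit. All names (`TRUE_UTILITY`, `collect_comparisons`, `fit_reward_model`, `optimize_policy`) are hypothetical.

```python
import math
import random

# Assumption: the user's intention is a hidden utility per action.
# The agent never reads this table; it only sees comparison outcomes.
TRUE_UTILITY = [0.1, 0.9, 0.4, 0.2]


def collect_comparisons(n_queries, rng):
    """Simulate user feedback as noisy pairwise comparisons (winner, loser)."""
    comparisons = []
    for _ in range(n_queries):
        a, b = rng.sample(range(len(TRUE_UTILITY)), 2)
        # The user prefers the higher-utility action most of the time.
        p_a = 1.0 / (1.0 + math.exp(-5.0 * (TRUE_UTILITY[a] - TRUE_UTILITY[b])))
        comparisons.append((a, b) if rng.random() < p_a else (b, a))
    return comparisons


def fit_reward_model(comparisons, n_actions, lr=0.1, epochs=200):
    """Fit one reward value per action by ascending the Bradley-Terry log-likelihood."""
    reward = [0.0] * n_actions
    for _ in range(epochs):
        for winner, loser in comparisons:
            p_win = 1.0 / (1.0 + math.exp(-(reward[winner] - reward[loser])))
            grad = 1.0 - p_win  # d log p_win / d reward[winner]
            reward[winner] += lr * grad
            reward[loser] -= lr * grad
    return reward


def optimize_policy(reward, rng, steps=500, epsilon=0.1, lr=0.1):
    """Epsilon-greedy value learning that only ever sees the learned reward."""
    q = [0.0] * len(reward)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(q))  # explore
        else:
            action = max(range(len(q)), key=q.__getitem__)  # exploit
        q[action] += lr * (reward[action] - q[action])
    return max(range(len(q)), key=q.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    comparisons = collect_comparisons(200, rng)
    learned_reward = fit_reward_model(comparisons, len(TRUE_UTILITY))
    print("learned reward:", [round(r, 2) for r in learned_reward])
    print("chosen action:", optimize_policy(learned_reward, rng))
```

In this sketch the policy step never touches `TRUE_UTILITY`; it depends on the user only through the learned reward, which is the separation between learning the objective and optimizing it that the abstract describes.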
