Evaluating Agents without Rewards

12/21/2020
by   Brendon Matusch, et al.

Reinforcement learning has enabled agents to solve challenging tasks in unknown environments. However, manually crafting reward functions can be time-consuming, expensive, and error-prone. Competing objectives have been proposed for agents to learn without external supervision, but it has been unclear how well they reflect task rewards or human behavior. To accelerate the development of intrinsic objectives, we retrospectively compute potential objectives on pre-collected datasets of agent behavior, rather than optimizing them online, and compare them by analyzing their correlations. We study input entropy, information gain, and empowerment across seven agents, three Atari games, and the 3D game Minecraft. We find that all three intrinsic objectives correlate more strongly with a human behavior similarity metric than with task reward. Moreover, input entropy and information gain correlate more strongly with human similarity than task reward does, suggesting the use of intrinsic objectives for designing agents that behave similarly to human players.
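
The retrospective approach described in the abstract can be illustrated with a rough sketch: compute an objective offline on each agent's stored observations, then correlate it with another per-agent metric. The code below is an assumption-laden illustration, not the authors' implementation. It estimates input entropy as the Shannon entropy of the empirical distribution over already-discretized observations, and the agent names, toy observations, and reward values are hypothetical placeholders.

```python
import numpy as np


def input_entropy(observations: np.ndarray) -> float:
    """Shannon entropy (in nats) of the empirical distribution over observations.

    Assumes `observations` has shape (num_steps, num_features) and is already
    discretized, so that identical rows count as the same input.
    """
    # Count how often each distinct observation (row) occurs in the dataset.
    _, counts = np.unique(observations, axis=0, return_counts=True)
    probs = counts / counts.sum()
    return float(-(probs * np.log(probs)).sum())


def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])


# Hypothetical pre-collected datasets: (observations, mean task reward) per agent.
datasets = {
    "noop": (np.zeros((8, 2), dtype=int), 0.0),
    "two_state": (np.array([[0, 0], [0, 1]] * 4), 1.0),
    "four_state": (np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 2), 3.0),
}

# Compute the objective offline for every agent, without any online optimization.
entropies = [input_entropy(obs) for obs, _ in datasets.values()]
rewards = [reward for _, reward in datasets.values()]

# Compare the intrinsic objective against task reward across agents.
print("entropy vs. reward correlation:", pearson(entropies, rewards))
```

The same `pearson` helper could be applied to any other pair of per-agent metrics, such as an intrinsic objective and a human similarity score, provided both are computed on the same set of agents.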

research
04/17/2018

On Learning Intrinsic Rewards for Policy Gradient Methods

In many sequential decision making tasks, it is challenging to design re...
research
05/21/2017

Experience enrichment based task independent reward model

For most reinforcement learning approaches, the learning is performed by...
research
05/12/2019

Mega-Reward: Achieving Human-Level Play without Extrinsic Rewards

Intrinsic rewards are introduced to simulate how human intelligence work...
research
03/16/2021

Learning to Shape Rewards using a Game of Switching Controls

Reward shaping (RS) is a powerful method in reinforcement learning (RL) ...
research
12/05/2019

Learning Human Objectives by Evaluating Hypothetical Behavior

We seek to align agent behavior with a user's objectives in a reinforcem...
research
09/14/2021

Benchmarking the Spectrum of Agent Capabilities

Evaluating the general abilities of intelligent agents requires complex ...
research
09/03/2020

Action and Perception as Divergence Minimization

We introduce a unified objective for action and perception of intelligen...
