Gamma-Nets: Generalizing Value Estimation over Timescale

11/18/2019
by Craig Sherstan, et al.
We present Γ-nets, a method for generalizing value function estimation over timescale. By using the timescale as one of the estimator's inputs, we can estimate value for arbitrary timescales. As a result, the prediction target for any timescale is available, and we are free to train on multiple timescales at each timestep. Here we empirically evaluate Γ-nets in the policy evaluation setting. We first demonstrate the approach on a square wave and then on a robot arm, using linear function approximation. Next, we consider the deep reinforcement learning setting using several Atari video games. Our results show that Γ-nets can be effective for predicting arbitrary timescales, with only a small cost in accuracy compared to learning estimators for fixed timescales. Γ-nets provide a method for compactly making predictions at many timescales without requiring a priori knowledge of the task, making them a valuable contribution to ongoing work on model-based planning, representation learning, and lifelong learning algorithms.
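
The core idea, feeding the timescale to the estimator and training on several timescales per step, can be illustrated with a minimal sketch. It assumes a linear estimator, a polynomial embedding of the discount γ, and a semi-gradient TD(0) update; all function names, the embedding, and the step size are hypothetical choices for illustration and are not taken from the paper.

```python
def gamma_features(gamma):
    # Hypothetical embedding of the timescale (assumption, not the paper's).
    return [1.0, gamma, gamma ** 2]


def joint_features(state_features, gamma):
    # Cross state features with the timescale embedding so that the
    # effective weight on each state feature can vary with gamma.
    return [s * g for s in state_features for g in gamma_features(gamma)]


def predict(weights, state_features, gamma):
    # Linear value estimate for one state at one timescale.
    x = joint_features(state_features, gamma)
    return sum(w * xi for w, xi in zip(weights, x))


def td_update(weights, state, reward, next_state, gammas, step_size=0.05):
    # One semi-gradient TD(0) step per sampled timescale. Every target is
    # bootstrapped from the same estimator, queried at that timescale.
    new_weights = list(weights)
    for gamma in gammas:
        target = reward + gamma * predict(weights, next_state, gamma)
        error = target - predict(weights, state, gamma)
        for i, xi in enumerate(joint_features(state, gamma)):
            new_weights[i] += step_size * error * xi
    return new_weights


# Usage: one transition, trained at two timescales at once.
w = td_update([0.0] * 6, [1.0, 0.0], 1.0, [0.0, 1.0], gammas=[0.0, 0.5])
```

Because the target for any γ is computed from the same weights, a single transition yields a training signal for as many timescales as are sampled in `gammas`.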
