Estimating Q(s,s') with Deep Deterministic Dynamics Gradients

02/21/2020
by Ashley D. Edwards, et al.

In this paper, we introduce a novel form of value function, Q(s, s'), that expresses the utility of transitioning from a state s to a neighboring state s' and then acting optimally thereafter. In order to derive an optimal policy, we develop a forward dynamics model that learns to make next-state predictions that maximize this value. This formulation decouples actions from values while still learning off-policy. We highlight the benefits of this approach in terms of value function transfer, learning within redundant action spaces, and learning off-policy from state observations generated by sub-optimal or completely random policies. Code and videos are available at <sites.google.com/view/qss-paper>.
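
One way to read the definition above is as a Bellman-style backup over state pairs: the value of moving from s to s' is the reward for that transition plus the discounted value of the best transition out of s'. The following is a minimal tabular sketch under that reading, not the authors' implementation. The four-state chain, the function names `qss_tabular` and `greedy_next_state`, the hyperparameters, and the explicit max over observed successor states (standing in for the learned forward dynamics model) are all assumptions made for illustration.

```python
import numpy as np

def qss_tabular(transitions, n_states, terminal, gamma=0.9, alpha=1.0, sweeps=50):
    """Tabular sketch: q[s, s2] estimates the value of moving s -> s2,
    then acting optimally afterwards. No actions are used anywhere."""
    q = np.zeros((n_states, n_states))
    # neighbors[s] = states that were observed to directly follow s
    neighbors = {s: set() for s in range(n_states)}
    for s, s2, _ in transitions:
        neighbors[s].add(s2)
    for _ in range(sweeps):
        for s, s2, r in transitions:
            if s2 in terminal or not neighbors[s2]:
                target = r  # nothing to bootstrap from
            else:
                # Assumption: a max over observed successors replaces
                # the learned next-state prediction model.
                target = r + gamma * max(q[s2, s3] for s3 in neighbors[s2])
            q[s, s2] += alpha * (target - q[s, s2])
    return q, neighbors

def greedy_next_state(q, neighbors, s):
    """Pick the observed successor of s with the highest q[s, s2]."""
    return max(neighbors[s], key=lambda s2: q[s, s2])

# Hypothetical chain 0 - 1 - 2 - 3 with reward 1 for entering state 3.
# Each tuple is (state, next_state, reward).
transitions = [(0, 1, 0.0), (1, 0, 0.0), (1, 2, 0.0), (2, 1, 0.0), (2, 3, 1.0)]
q, neighbors = qss_tabular(transitions, n_states=4, terminal={3})
best_from_1 = greedy_next_state(q, neighbors, 1)
```

The transitions here carry no action labels, which is what lets this kind of update use state observations alone. Turning a chosen next state back into an action would need a separate component, which the sketch omits.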


