Optimal Reinforcement Learning for Gaussian Systems

06/04/2011
by Philipp Hennig, et al.

The exploration-exploitation trade-off is among the central challenges of reinforcement learning. The optimal Bayesian solution is intractable in general. This paper studies to what extent analytic statements about optimal learning are possible if all beliefs are Gaussian processes. A first-order approximation of learning of both loss and dynamics, for nonlinear, time-varying systems in continuous time and space, subject to a relatively weak restriction on the dynamics, is described by an infinite-dimensional partial differential equation. An approximate finite-dimensional projection gives an impression of how this result may be helpful.
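To make the abstract's premise concrete, the sketch below shows the standard Gaussian-process regression update that underlies a GP belief over unknown dynamics. This is an illustrative assumption, not the paper's algorithm: the kernel, noise level, drift function, and data are all made up for the example. The posterior variance it computes is the kind of uncertainty signal an exploration scheme would act on.

```python
# Minimal sketch (assumed setup, not the paper's method): a GP belief over an
# unknown one-dimensional drift f(x), as in dx = f(x) dt + noise, updated from
# noisy observations of state derivatives.
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, variance=1.0):
    """Squared-exponential covariance k(a, b); hyperparameters are illustrative."""
    d = A[:, None] - B[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X, y, Xs, noise=1e-2):
    """Posterior mean and pointwise variance of f at test inputs Xs,
    given observations y = f(X) + Gaussian noise."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    Kss = rbf_kernel(Xs, Xs)
    alpha = np.linalg.solve(K, y)
    mean = Ks.T @ alpha
    cov = Kss - Ks.T @ np.linalg.solve(K, Ks)
    return mean, np.diag(cov)

# Hypothetical transition data: noisy evaluations of a stand-in true drift.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=8)
y = np.sin(X) + 0.1 * rng.standard_normal(8)   # observed dx/dt, assumed
Xs = np.linspace(-2.0, 2.0, 5)

mean, var = gp_posterior(X, y, Xs)
for x, m, v in zip(Xs, mean, var):
    # Large Var[f] marks regions where exploration would be most informative.
    print(f"x = {x:+.2f}   E[f] = {m:+.3f}   Var[f] = {v:.3f}")
```

The paper's contribution goes beyond this update: it couples such GP beliefs on loss and dynamics to the optimal-control problem itself, yielding the infinite-dimensional PDE mentioned above.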

Related research

- POMDPs in Continuous Time and Discrete Spaces (10/02/2020). Many processes, such as discrete event systems in engineering or populat...
- Dual Control for Approximate Bayesian Reinforcement Learning (10/13/2015). Control of non-episodic, finite-horizon dynamical systems with uncertain...
- Policy Optimization for Continuous Reinforcement Learning (05/30/2023). We study reinforcement learning (RL) in the setting of continuous time a...
- Distributional Offline Continuous-Time Reinforcement Learning with Neural Physics-Informed PDEs (SciPhy RL for DOCTR-L) (04/02/2021). This paper addresses distributional offline continuous-time reinforcemen...
- Multidimensional Projection Filters via Automatic Differentiation and Sparse-Grid Integration (12/16/2021). The projection filter is a method for approximating the dynamics of cond...
- Statistical Estimation and Nonlinear Filtering in Environmental Pollution (07/11/2021). This paper studies a nonlinear filtering problem over an infinite time i...
