Neural Laplace Control for Continuous-time Delayed Systems

02/24/2023
by Samuel Holt, et al.

Many real-world offline reinforcement learning (RL) problems involve continuous-time environments with delays. Such environments have two distinctive features: first, the state x(t) is observed at irregular time intervals, and second, the current action a(t) only affects the future state x(t + g), with an unknown delay g > 0. A prime example is satellite control, where the communication link between Earth and a satellite causes irregular observations and delays. Existing offline RL algorithms have achieved success in environments with either irregularly observed states or known delays. However, environments involving both irregular observations in time and unknown delays remain an open and challenging problem. To this end, we propose Neural Laplace Control, a continuous-time model-based offline RL method that combines a Neural Laplace dynamics model with a model predictive control (MPC) planner and is able to learn from an offline dataset sampled at irregular time intervals from an environment that has an inherent unknown constant delay. We show experimentally on continuous-time delayed environments that it is able to achieve near-expert policy performance.
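
To make the two features of such an environment concrete, the following is a minimal sketch of a toy delayed system observed at irregular times. It is not the paper's method or any of its benchmark environments: the scalar dynamics dx/dt = -x(t) + a(t - g), the helper names `simulate_delayed` and `sample_irregular`, the zero action before the delay has elapsed, and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_delayed(x0, action_fn, delay, t_end, dt=1e-3):
    """Euler-integrate the toy system dx/dt = -x(t) + a(t - delay).

    Assumption: no action has reached the system before time `delay`,
    so the effective input is zero until then.
    """
    n = int(round(t_end / dt)) + 1
    ts = np.linspace(0.0, t_end, n)
    xs = np.empty(n)
    xs[0] = x0
    for k in range(n - 1):
        t_delayed = ts[k] - delay
        a = action_fn(t_delayed) if t_delayed >= 0.0 else 0.0
        xs[k + 1] = xs[k] + dt * (-xs[k] + a)
    return ts, xs


def sample_irregular(ts, xs, n_obs, rng):
    """Keep n_obs grid points chosen at random, sorted in time."""
    idx = np.sort(rng.choice(len(ts), size=n_obs, replace=False))
    return ts[idx], xs[idx]


rng = np.random.default_rng(0)

# Constant action a(t) = 1 applied from t = 0, felt only after g = 0.5.
ts, xs = simulate_delayed(x0=0.0, action_fn=lambda t: 1.0, delay=0.5, t_end=5.0)

# An offline dataset would only contain these irregularly spaced samples.
t_obs, x_obs = sample_irregular(ts, xs, n_obs=20, rng=rng)
```

In this sketch the state stays at zero until the delay has elapsed and only then starts responding to the action, while `t_obs` has uneven gaps between consecutive observations. A learner given only `(t_obs, x_obs)` and the actions would have to infer both the dynamics and the delay g.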

