Non-Episodic Learning for Online LQR of Unknown Linear Gaussian System

03/24/2021
by Yiwen Lu, et al.

This paper considers the data-driven linear-quadratic regulation (LQR) problem in which the system parameters are unknown and must be identified in real time. In contrast to existing system identification and data-driven control methods, which typically require either offline data collection or multiple resets, we propose an online non-episodic algorithm that learns the system from a single trajectory. The algorithm guarantees that both the identification error and the suboptimality gap of the control performance along this trajectory converge to zero almost surely. Furthermore, we characterize the almost sure convergence rates of identification and control, and reveal an optimal trade-off between exploration and exploitation. We provide a numerical example to illustrate the effectiveness of the proposed strategy.
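To make the setting concrete, the following is a minimal sketch of a generic certainty-equivalence scheme on a single trajectory: least-squares identification of the pair (A, B), periodic redesign of the LQR gain from the current estimates, and an exploration input whose magnitude decays over time. It is not the paper's algorithm. The function names, the `t ** -0.25` decay, the update period, and the ridge term are all assumptions chosen for illustration, and the sketch assumes an open-loop stable system so that no safeguard against destabilizing gains is needed.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=200):
    """Approximate the infinite-horizon LQR gain K (u = K x) by Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A + B @ K)
    return K

def run_online_lqr(A, B, Q, R, T=3000, update_every=100, seed=0):
    """Single-trajectory loop: identify [A B] by least squares while controlling.

    Hypothetical illustration only; decay rate and update period are assumed.
    """
    rng = np.random.default_rng(seed)
    n, m = B.shape
    x = np.zeros(n)
    K = np.zeros((m, n))            # start with no feedback (A assumed stable)
    A_hat = np.zeros((n, n))
    B_hat = np.zeros((n, m))
    V = 1e-3 * np.eye(n + m)        # ridge-regularized sum of z z^T
    S = np.zeros((n, n + m))        # sum of x_next z^T
    for t in range(1, T + 1):
        # Exploit the current gain, plus exploration noise that decays with t.
        u = K @ x + t ** -0.25 * rng.standard_normal(m)
        # True dynamics with unit-variance Gaussian process noise.
        x_next = A @ x + B @ u + rng.standard_normal(n)
        z = np.concatenate([x, u])
        V += np.outer(z, z)
        S += np.outer(x_next, z)
        x = x_next
        if t % update_every == 0:
            theta = np.linalg.solve(V, S.T).T   # least-squares estimate of [A B]
            A_hat, B_hat = theta[:, :n], theta[:, n:]
            K = lqr_gain(A_hat, B_hat, Q, R)
    return A_hat, B_hat, K

# Example system (assumed for illustration, open-loop stable).
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
A_hat, B_hat, K = run_online_lqr(A_true, B_true, np.eye(2), np.eye(1))
print("identification error:", np.linalg.norm(A_hat - A_true), np.linalg.norm(B_hat - B_true))
```

The decay rate of the exploration term is where the trade-off mentioned in the abstract shows up: slower decay improves identification but costs control performance, while faster decay does the reverse.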

