Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation

07/13/2020
by Marc Abeille, et al.

We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting. Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ and cast it into a constrained extended LQR problem, where an additional control variable implicitly selects the system dynamics within a confidence interval. We then move to the corresponding Lagrangian formulation, for which we prove strong duality. As a result, we show that an ϵ-optimistic controller can be computed efficiently by solving at most O(log(1/ϵ)) Riccati equations. Finally, we prove that relaxing the original problem does not impact the learning performance, thus recovering the Õ(√(T)) regret of OFU-LQ. To the best of our knowledge, this is the first computationally efficient confidence-based algorithm for LQR with worst-case optimal regret guarantees.
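
To give some intuition for why a Lagrangian formulation can lead to a cost of O(log(1/ϵ)) Riccati solves, here is a minimal toy sketch. It is not the extended LQR construction of the paper. It assumes a scalar system x' = a·x + b·u + noise, a hypothetical constraint on the stationary control power E[u²] ≤ budget, and a multiplier mu that plays the role of the control cost. All function names and numerical values are illustrative assumptions. Each bisection step on mu solves one scalar Riccati equation, so shrinking the multiplier interval to width eps takes about log2(range/eps) solves.

```python
import math

def solve_scalar_riccati(a, b, q, r):
    # Positive root of the scalar discrete-time Riccati equation
    #   p = q + a^2 p - (a b p)^2 / (r + b^2 p),
    # rewritten as b^2 p^2 + c p - q r = 0 with c = r - q b^2 - a^2 r.
    c = r - q * b * b - a * a * r
    return (-c + math.sqrt(c * c + 4.0 * b * b * q * r)) / (2.0 * b * b)

def control_power(a, b, q, mu, noise_var=1.0):
    # Stationary E[u^2] under the LQR controller u = -k x obtained
    # when the control cost is set to the multiplier mu.
    p = solve_scalar_riccati(a, b, q, mu)
    k = a * b * p / (mu + b * b * p)
    a_cl = a - b * k                      # closed-loop dynamics
    state_var = noise_var / (1.0 - a_cl ** 2)
    return k * k * state_var

def bisect_multiplier(a, b, q, budget, eps, lo=1e-6, hi=1e3):
    # Assumes control_power is decreasing in mu, exceeds the budget
    # at lo, and is below it at hi (true for the values used here).
    solves = 0
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        solves += 1                       # one Riccati solve per step
        if control_power(a, b, q, mid) > budget:
            lo = mid                      # constraint violated: raise mu
        else:
            hi = mid                      # constraint met: try smaller mu
    return hi, solves

mu, solves = bisect_multiplier(a=0.9, b=1.0, q=1.0, budget=0.2, eps=1e-6)
print(f"multiplier ~ {mu:.6f} after {solves} Riccati solves")
```

The number of solves depends only on the ratio between the initial multiplier range and the tolerance, which is the logarithmic behaviour the abstract refers to; the paper's actual algorithm and its guarantees are given in the full text.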
