Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation

by Marc Abeille, et al.

We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting. Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ and cast it into a constrained extended LQR problem, where an additional control variable implicitly selects the system dynamics within a confidence interval. We then move to the corresponding Lagrangian formulation, for which we prove strong duality. As a result, we show that an ϵ-optimistic controller can be computed efficiently by solving at most O(log(1/ϵ)) Riccati equations. Finally, we prove that relaxing the original problem does not impact the learning performance, thus recovering the Õ(√T) regret of OFU-LQ. To the best of our knowledge, this is the first computationally efficient confidence-based algorithm for LQR with worst-case optimal regret guarantees.
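To give a feel for why a Lagrangian formulation leads to a logarithmic number of Riccati solves, the following is a minimal sketch on a toy problem, not the algorithm of the paper. It assumes a scalar system x' = a x + b u + w with cost q x² + r u², and a hypothetical constraint on the stationary control energy E[u²] (standing in for the paper's confidence constraint). The constraint is moved into the cost with a multiplier mu, and mu is found by bisection, with one Riccati equation solved per step. All function names and parameter values are illustrative assumptions.

```python
import math


def solve_scalar_riccati(a, b, q, r):
    """Positive root of the scalar discrete-time algebraic Riccati equation.

    p = q + a^2 p - (a b p)^2 / (r + b^2 p) rearranges to
    b^2 p^2 + (r - q b^2 - a^2 r) p - q r = 0  (assumes b != 0, q > 0, r > 0).
    """
    beta = r - q * b**2 - a**2 * r
    return (-beta + math.sqrt(beta**2 + 4.0 * b**2 * q * r)) / (2.0 * b**2)


def lqr_gain(a, b, q, r):
    """Optimal linear feedback gain k (u = k x) for the scalar LQR."""
    p = solve_scalar_riccati(a, b, q, r)
    return -a * b * p / (r + b**2 * p)


def control_energy(a, b, k, noise_var=1.0):
    """Stationary E[u^2] under u = k x, valid when |a + b k| < 1."""
    c = a + b * k  # closed-loop dynamics x' = c x + w
    return k**2 * noise_var / (1.0 - c**2)


def bisect_multiplier(a, b, q, r, budget, mu_max, eps):
    """Bisection on the multiplier mu of the (hypothetical) energy constraint.

    Each iteration solves one Riccati equation with control weight r + mu.
    Assumes the constraint is violated at mu = 0 and satisfied at mu = mu_max.
    Returns the feasible end of the final interval and the number of solves.
    """
    lo, hi, solves = 0.0, mu_max, 0
    while hi - lo > eps:
        mu = 0.5 * (lo + hi)
        k = lqr_gain(a, b, q, r + mu)
        solves += 1
        if control_energy(a, b, k) > budget:
            lo = mu  # constraint violated: penalize control more
        else:
            hi = mu  # constraint satisfied: try a smaller penalty
    return hi, solves


if __name__ == "__main__":
    mu, solves = bisect_multiplier(0.9, 1.0, 1.0, 1.0,
                                   budget=0.1, mu_max=64.0, eps=1e-6)
    print(f"multiplier ~ {mu:.6f} after {solves} Riccati solves")
```

With these illustrative values the loop stops after ceil(log2(mu_max / eps)) = 26 Riccati solves, which is the O(log(1/ϵ)) scaling in miniature. The bisection is valid here because, in this scalar toy problem, the control energy decreases monotonically as mu grows.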




