Optimal Rates for Bandit Nonstochastic Control

05/24/2023
by Y. Jennifer Sun, et al.

Linear Quadratic Regulator (LQR) and Linear Quadratic Gaussian (LQG) control are foundational and extensively researched problems in optimal control. We investigate LQR and LQG problems with semi-adversarial perturbations and time-varying adversarial bandit loss functions. The best-known sublinear-regret algorithm of <cit.> attains regret scaling as T^(3/4) in the time horizon T, and its authors posed the open question of whether the tight rate of √(T) is achievable. We answer in the affirmative, giving an algorithm for bandit LQR and LQG that attains optimal regret (up to logarithmic factors) for both known and unknown systems. A central component of our method is a new scheme for bandit convex optimization with memory, which is of independent interest.
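The paper's bandit-convex-optimization-with-memory scheme is not spelled out in this abstract. As background only, here is a minimal sketch of the classic one-point gradient estimator used in (memoryless) bandit convex optimization, where the learner sees only the scalar loss value at the point it plays. The quadratic loss, step sizes, and horizon below are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical quadratic loss standing in for the adversarial bandit loss.
    return float(np.sum((x - 1.0) ** 2))

d, T = 2, 20000          # dimension and number of rounds (illustrative)
delta, eta = 0.1, 0.001  # exploration radius and step size (illustrative)
x = np.zeros(d)

for t in range(T):
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)       # uniform random direction on the sphere
    loss = f(x + delta * u)      # only zeroth-order (bandit) feedback is used
    g = (d / delta) * loss * u   # one-point estimate of the smoothed gradient
    x -= eta * g                 # online gradient-descent step
```

With bandit feedback the gradient estimate is high-variance, which is why naive approaches give worse regret rates; the √(T)-regret result above requires a more careful scheme that also handles the memory induced by the system dynamics.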


Related research:

- 08/12/2020, Non-Stochastic Control with Bandit Feedback: We study the problem of controlling a linear dynamical system with adver...
- 11/27/2019, The Nonstochastic Control Problem: We consider the problem of controlling an unknown linear dynamical syste...
- 10/25/2020, Geometric Exploration for Online Control: We study the control of an unknown linear dynamical system under general...
- 09/13/2018, Algorithms for Optimal Control with Fixed-Rate Feedback: We consider a discrete-time linear quadratic Gaussian networked control ...
- 02/29/2020, Logarithmic Regret for Adversarial Online Control: We introduce a new algorithm for online linear-quadratic control in a kn...
- 05/18/2018, Projection-Free Bandit Convex Optimization: In this paper, we propose the first computationally efficient projection...
- 08/02/2022, Unimodal Mono-Partite Matching in a Bandit Setting: We tackle a new emerging problem, which is finding an optimal monopartit...
