Improved Regret Bounds for Tracking Experts with Memory

06/24/2021
by James Robinson et al.

We address the problem of sequential prediction with expert advice in a non-stationary environment, with long-term memory guarantees in the sense of Bousquet and Warmuth [4]. We give a linear-time algorithm that improves on the best known regret bounds [26]. The algorithm incorporates a relative entropy projection step, which is advantageous over previous weight-sharing approaches in settings where weight updates carry implicit costs, as in portfolio optimization, for example. We also give an algorithm that computes this projection step in linear time, which may be of independent interest.
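To illustrate the kind of relative entropy projection the abstract refers to, the sketch below projects a weight vector onto a lower-bounded simplex, in the spirit of fixed-share-style guarantees. This is a hypothetical, simplified variant for intuition only: the constraint set, the function `kl_project_lower_bound`, and the parameter `eps` are assumptions of this example, not the paper's actual algorithm or constraint set.

```python
import numpy as np

def kl_project_lower_bound(p, eps):
    """Illustrative KL projection (not the paper's algorithm): find the
    minimiser of KL(w || p) over {w : w_i >= eps, sum_i w_i = 1}.
    The solution has the water-filling form w_i = max(eps, c * p_i),
    with c chosen so the entries sum to 1.  This version sorts and runs
    in O(n log n); a linear-time variant would use selection instead."""
    p = np.asarray(p, dtype=float)
    n = len(p)
    assert n * eps <= 1.0, "lower bound must be feasible"
    ps = np.sort(p)[::-1]            # masses in descending order
    prefix = np.cumsum(ps)           # prefix sums of the sorted masses
    for k in range(n, 0, -1):
        # Hypothesis: the top-k entries are scaled by c, the rest are
        # clamped at eps; solve sum = 1 for c under that hypothesis.
        c = (1.0 - eps * (n - k)) / prefix[k - 1]
        if c * ps[k - 1] >= eps:     # smallest scaled entry still >= eps
            break
    return np.maximum(eps, c * p)

# Example: the smallest weight 0.1 is pushed up to the floor eps = 0.15,
# and the remaining mass is rescaled proportionally.
w = kl_project_lower_bound([0.7, 0.2, 0.1], eps=0.15)
```

Keeping every expert's weight bounded away from zero is what lets an algorithm "remember" experts that performed well in the past, which is the intuition behind long-term memory guarantees; the paper's projection step serves a related role in its update.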


Related research

02/08/2023  Non-Stationary Bandits with Knapsack Problems with Advice
            We consider a non-stationary Bandits with Knapsack problem. The outcome ...

05/30/2019  Equipping Experts/Bandits with Long-term Memory
            We propose the first reduction-based approach to obtaining long-term mem...

02/15/2012  Mirror Descent Meets Fixed Share (and feels no regret)
            Mirror descent with an entropic regularizer is known to achieve shifting...

03/07/2019  A Rank-1 Sketch for Matrix Multiplicative Weights
            We show that a simple randomized sketch of the matrix multiplicative wei...

03/09/2021  Regret Bounds for Generalized Linear Bandits under Parameter Drift
            Generalized Linear Bandits (GLBs) are powerful extensions to the Linear ...

03/04/2011  Adapting to Non-stationarity with Growing Expert Ensembles
            When dealing with time series with complex non-stationarities, low retro...

02/17/2016  Online optimization and regret guarantees for non-additive long-term constraints
            We consider online optimization in the 1-lookahead setting, where the ob...
