Linear Bandits on Uniformly Convex Sets

03/10/2021 ∙ by Thomas Kerdreux, et al.

Linear bandit algorithms yield Õ(n√T) pseudo-regret bounds on compact convex action sets 𝒦 ⊂ ℝ^n, and two types of structural assumptions lead to better pseudo-regret bounds. When 𝒦 is the simplex or an ℓ_p ball with p ∈ (1,2], there exist bandit algorithms with Õ(√(nT)) pseudo-regret bounds. Here, we derive bandit algorithms for some strongly convex sets beyond ℓ_p balls that enjoy pseudo-regret bounds of Õ(√(nT)), which answers an open question from [BCB12, 5.5]. Interestingly, when the action set is uniformly convex but not necessarily strongly convex, we obtain pseudo-regret bounds with a dimension dependency smaller than O(√n). However, this comes at the expense of asymptotic rates in T varying between Õ(√T) and Õ(T).
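To give a sense of scale for the two rates quoted above, the following minimal sketch compares the orders n√T and √(nT) numerically. It is only an illustration: constants and logarithmic factors hidden by the Õ notation are dropped, the function names are hypothetical, and the values of n and T are arbitrary choices rather than figures from the paper.

```python
import math


def generic_rate(n: int, T: int) -> float:
    """Order n * sqrt(T) of the generic bound (constants and logs dropped)."""
    return n * math.sqrt(T)


def structured_rate(n: int, T: int) -> float:
    """Order sqrt(n * T) of the improved bound (constants and logs dropped)."""
    return math.sqrt(n * T)


# Illustrative values only (assumed, not taken from the paper).
n, T = 100, 10_000

# The ratio of the two orders is sqrt(n), independent of T.
ratio = generic_rate(n, T) / structured_rate(n, T)
print(f"generic: {generic_rate(n, T):.0f}, "
      f"structured: {structured_rate(n, T):.0f}, ratio: {ratio:.1f}")
```

Under these simplifications, the gap between the two rates is a factor of √n whatever the horizon T, which is why the improvement matters most in high dimension.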
