Online Nonsubmodular Minimization with Delayed Costs: From Full Information to Bandit Feedback

by Tianyi Lin, et al.

Motivated by applications to online learning in sparse estimation and Bayesian optimization, we consider the problem of online unconstrained nonsubmodular minimization with delayed costs in both the full information and bandit feedback settings. In contrast to previous works on online unconstrained submodular minimization, we focus on a class of nonsubmodular functions with special structure, and prove regret guarantees for several variants of the online and approximate online bandit gradient descent algorithms in static and delayed scenarios. We derive bounds for the agent's regret in both the full information and bandit feedback settings, even if the delay between choosing a decision and receiving the incurred cost is unbounded. Key to our approach are the notion of (α, β)-regret and the extension of the generic convex relaxation model from <cit.>, the analysis of which is of independent interest. We present several simulation studies to demonstrate the efficacy of our algorithms.
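To make the delayed full-information setting concrete, the following is a minimal sketch of online gradient descent on a continuous relaxation, where the cost observed at round t only becomes available at round t + d_t. It is an illustration under stated assumptions and not the algorithm analyzed in the paper: the relaxation used here is the standard Lovász extension (which is convex only for submodular functions, so for nonsubmodular costs it acts as a heuristic surrogate), and the names `lovasz_subgradient`, `delayed_ogd`, `step`, and `delays` are hypothetical.

```python
def lovasz_subgradient(set_fn, x):
    """Greedy (sorting-based) subgradient of the Lovász extension of set_fn at x.

    set_fn maps a frozenset of indices to a real cost. Assumption: the
    Lovász extension stands in for the paper's convex relaxation.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])  # decreasing coordinates
    g = [0.0] * len(x)
    chosen = set()
    prev = set_fn(frozenset())
    for i in order:
        chosen.add(i)
        cur = set_fn(frozenset(chosen))
        g[i] = cur - prev  # marginal cost of adding element i
        prev = cur
    return g


def delayed_ogd(set_fns, delays, n, step=0.1):
    """Projected gradient descent on [0, 1]^n with delayed feedback.

    set_fns[t] is the cost chosen at round t; its subgradient, evaluated at
    the point played in round t, only arrives at round t + delays[t].
    Returns the list of points played in each round.
    """
    x = [0.5] * n
    pending = {}   # arrival round -> subgradients revealed in that round
    iterates = []
    for t, (f, d) in enumerate(zip(set_fns, delays)):
        iterates.append(list(x))                  # play x_t
        g = lovasz_subgradient(f, x)              # feedback generated now ...
        pending.setdefault(t + d, []).append(g)   # ... but revealed later
        for g_arrived in pending.pop(t, []):      # apply whatever arrives now
            x = [min(1.0, max(0.0, xi - step * gi))
                 for xi, gi in zip(x, g_arrived)]
    return iterates


# Usage: a modular cost with weights (1, -1), every cost delayed by 2 rounds.
weights = [1.0, -1.0]
cost = lambda S: sum(weights[i] for i in S)
points = delayed_ogd([cost] * 20, [2] * 20, n=2)
selected = {i for i, xi in enumerate(points[-1]) if xi > 0.5}  # threshold rounding
```

In this sketch the iterate stays put until the first delayed subgradient arrives, after which each arriving subgradient triggers one projected step. Thresholding the final point at 0.5 is one simple way to recover a set and is likewise an assumption of the sketch.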



Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback

In this paper, we propose three online algorithms for submodular maximis...

Online Boosting with Bandit Feedback

We consider the problem of online boosting for regression tasks, when on...

Delayed Bandit Online Learning with Unknown Delays

This paper studies bandit learning problems with delayed feedback, which...

Differentially Private Online Submodular Optimization

In this paper we develop the first algorithms for online submodular mini...

Understanding the Role of Feedback in Online Learning with Switching Costs

In this paper, we study the role of feedback in online learning with swi...

Trading Off Resource Budgets for Improved Regret Bounds

In this work we consider a variant of adversarial online learning where ...

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

In the fixed budget thresholding bandit problem, an algorithm sequential...
