Julian Zimmert

research

∙ 09/02/2023

Bypassing the Simulator: Near-Optimal Adversarial Linear Contextual Bandits

We consider the adversarial linear contextual bandit problem, where the ...

Haolin Liu, et al.

research

∙ 08/21/2023

An Improved Best-of-both-worlds Algorithm for Bandits with Delayed Feedback

We propose a new best-of-both-worlds algorithm for bandits with variably...

Saeed Masoudian, et al.

research

∙ 02/20/2023

A Blackbox Approach to Best of Both Worlds in Bandits and Beyond

Best-of-both-worlds algorithms for online learning which achieve near-op...

Christoph Dann, et al.

research

∙ 02/18/2023

Best of Both Worlds Policy Optimization

Policy optimization methods are popular reinforcement learning algorithm...

Christoph Dann, et al.

research

∙ 10/17/2022

A Unified Algorithm for Stochastic Path Problems

We study reinforcement learning in stochastic path (SP) problems. The go...

Christoph Dann, et al.

research

∙ 06/29/2022

A Best-of-Both-Worlds Algorithm for Bandits with Delayed Feedback

We present a modified tuning of the algorithm of Zimmert and Seldin [202...

Saeed Masoudian, et al.

research

∙ 06/20/2022

Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality

We revisit the problem of stochastic online learning with feedback graph...

Teodor V. Marinov, et al.

research

∙ 02/06/2022

Pushing the Efficiency-Regret Pareto Frontier for Online Learning of Portfolios and Quantum States

We revisit the classical online portfolio selection problem. It is widel...

Julian Zimmert, et al.

research

∙ 10/25/2021

The Pareto Frontier of Model Selection for General Contextual Bandits

Recent progress in model selection raises the question of the fundamenta...

Teodor V. Marinov, et al.

research

∙ 10/07/2021

A Model Selection Approach for Corruption Robust Reinforcement Learning

We develop a model selection approach to tackle reinforcement learning w...

Chen-Yu Wei, et al.

research

∙ 10/06/2021

Efficient Methods for Online Multiclass Logistic Regression

Multiclass logistic regression is a fundamental task in machine learning...

Naman Agarwal, et al.

research

∙ 07/12/2021

Adapting to Misspecification in Contextual Bandits

A major research direction in contextual bandits is to develop algorithm...

Dylan J. Foster, et al.

research

∙ 07/02/2021

Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

We provide improved gap-dependent regret bounds for reinforcement learni...

Christoph Dann, et al.

research

∙ 03/03/2020

Model Selection in Contextual Stochastic Bandit Problems

We study model selection in stochastic bandit problems. Our approach rel...

Aldo Pacchiano, et al.

research

∙ 02/27/2020

Online Learning for Active Cache Synchronization

Existing multi-armed bandit (MAB) models make two implicit assumptions: ...

Andrey Kolobov, et al.

research

∙ 10/14/2019

An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays

We propose a new algorithm for adversarial multi-armed bandits with unre...

Julian Zimmert, et al.

research

∙ 05/28/2019

Connections Between Mirror Descent, Thompson Sampling and the Information Ratio

The information-theoretic analysis by Russo and Van Roy (2014) in combin...

Julian Zimmert, et al.

research

∙ 01/25/2019

Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously

We develop the first general semi-bandit algorithm that simultaneously a...

Julian Zimmert, et al.

research

∙ 07/19/2018

An Optimal Algorithm for Stochastic and Adversarial Bandits

We provide an algorithm that achieves the optimal (up to constants) fini...

Julian Zimmert, et al.

research

∙ 07/04/2018

Factored Bandits

We introduce the factored bandits model, which is a framework for learni...

Julian Zimmert, et al.
