A major challenge in reinforcement learning is to develop practical, sam...
A central problem in the theory of multi-agent reinforcement learning (M...
We consider the development of adaptive, instance-dependent algorithms f...
We study the design of sample-efficient algorithms for reinforcement lea...
We consider the problem of decentralized multi-agent reinforcement learn...
A foundational problem in reinforcement learning and interactive decisio...
We consider the problem of interactive decision making, encompassing str...
Coverage conditions – which assert that the data logging distribution ad...
A central problem in sequential decision making is to develop algorithms...
A central problem in online learning and decision making – from bandits ...
Consider the problem setting of Interaction-Grounded Learning (IGL), in ...
In real-world reinforcement learning applications the learner's observat...
A fundamental challenge in interactive learning and decision making, ran...
We consider the offline reinforcement learning problem, where the aim is...
We consider the task of estimating a conditional density using i.i.d. sa...
A major research direction in contextual bandits is to develop algorithm...
A recurring theme in statistical learning, online learning, and beyond i...
We study the relationship between the eluder dimension for a function cl...
We obtain global, non-asymptotic convergence guarantees for independent ...
We introduce a new problem setting for continuous control called the LQR...
In the classical multi-armed bandit problem, instance-dependent algorith...
We consider the classical problem of sequential probability assignment u...
We design an algorithm which finds an ϵ-approximate stationary point (wi...
In statistical learning, algorithms for model selection allow the learne...
We introduce algorithms for learning nonlinear dynamical systems of the ...
We introduce a new algorithm for online linear-quadratic control in a kn...
A fundamental challenge in contextual bandits is to develop flexible, ge...
We consider the problem of online adaptive control of the linear quadrat...
We lower bound the complexity of finding ϵ-stationary points (with gradi...
We show that the Rademacher complexity of any R^K-valued function class ...
We introduce the problem of model selection for contextual bandits, wher...
We study tensor completion in the agnostic setting. In the classical ten...
We present an extensive study of generalization for data-dependent hypot...
In distributed statistical learning, N samples are split across m machin...
We provide excess risk guarantees for statistical learning in the presen...
We investigate 1) the rate at which refined properties of the empirical ...
We introduce a new family of margin-based regret guarantees for adversar...
Learning linear predictors with the logistic loss---both in stochastic a...
We uncover a fairly general principle in online learning: If regret can ...
A major challenge in contextual bandits is to design general-purpose alg...
We introduce an efficient algorithmic framework for model selection in o...
This paper presents a margin-based multiclass generalization bound for n...
We develop a novel family of algorithms for the online learning setting ...
We propose a general framework for studying adaptive regret bounds in th...