We study the trade-off between expectation and tail risk for regret
dist...
We design new policies that ensure both worst-case optimality for expect...
In this paper, we propose a PAC-Bayesian a posteriori parameter
selectio...
We consider the offline reinforcement learning problem, where the aim is...
We consider a seller offering a large network of N products over a time
...
Motivated by emerging applications such as live-streaming e-commerce,
pr...
We develop novel learning rates for conditional mean embeddings by apply...
We consider model-free reinforcement learning (RL) in non-stationary Mar...
In the classical multi-armed bandit problem, instance-dependent algorith...
The prevalence of e-commerce has made customers' detailed personal
infor...
In switchback experiments, a firm sequentially exposes an experimental u...
We propose two new Q-learning algorithms, Full-Q-Learning (FQL) and
Elim...
We consider undiscounted reinforcement learning (RL) in Markov decision...
In this paper, we study a revenue management problem with add-on discount...
We consider the general (stochastic) contextual bandit problem under the...
This work is motivated by a practical concern from our retail partner. W...
This paper investigates the impact of pre-existing offline data on onlin...
We consider an assortment optimization problem where a customer chooses ...
We study an online knapsack problem where the items arrive sequentially ...
We propose algorithms with state-of-the-art dynamic regret bounds for
un...
We consider the classical stochastic multi-armed bandit problem with a
c...
Motivated by the dynamic assortment offerings and item pricings occurrin...
The recent rising popularity of ultra-fast delivery services on retail
p...
We introduce general data-driven decision-making algorithms that achieve...
We study the problem of learning across a sequence of price
experiments ...
Classically, the time complexity of a first-order method is estimated by...
This work is motivated by our collaboration with a large Consumer Packag...
In this paper we study the single-leg revenue management problem, with n...
We study a general problem of allocating limited resources to heterogene...
We introduce algorithms that achieve state-of-the-art dynamic regret
bou...
Randomized experiments have been critical tools of decision making for
d...
Randomized experiments have been used to assist decision-making in many
...