Many organizations measure treatment effects via an experimentation plat...
We study the regret of reinforcement learning from offline data generate...
Incorporating side observations of predictive features can help reduce
u...
Dynamic treatment regimes (DTRs) for are personalized, sequential treatm...
We study a nonparametric contextual bandit problem where the expected re...