Online Batch Decision-Making with High-Dimensional Covariates

02/21/2020
by Chi-Hua Wang, et al.

We propose and investigate a new class of algorithms for sequential decision making that interact with a batch of users simultaneously, rather than with a single user, at each decision epoch. This type of batch model is motivated by interactive marketing and clinical trials, where a group of people is treated simultaneously and the outcomes of the whole group are collected before the next stage of decision making. In such a scenario, our goal is to allocate a batch of treatments so as to maximize treatment efficacy based on observed high-dimensional user covariates. We deliver a solution, named the Teamwork LASSO Bandit algorithm, that resolves a batch version of the explore-exploit dilemma by switching between a teamwork stage and a selfish stage over the course of the decision process. This is made possible by statistical properties of the LASSO estimate of treatment efficacy, which adapts to a sequence of batch observations. In general, a rate of optimal allocation condition is proposed to delineate the exploration-exploitation trade-off in the data collection scheme; this condition is sufficient for LASSO to identify the optimal treatment for observed user covariates. An upper bound on the expected cumulative regret of the proposed algorithm is provided.
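The two-stage idea described above can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the stage schedule, the penalty level, or the rate of optimal allocation condition, so everything below is an assumption made for illustration. In particular, the fixed number of initial teamwork epochs (`n_teamwork`), the random balanced assignment during those epochs, the linear reward model, the penalty `lam`, and the function names `lasso_ista` and `run_batch_bandit` are all hypothetical. LASSO is fitted here with plain proximal gradient descent (ISTA) to keep the block self-contained.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal map of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_ista(X, y, lam, n_iter=200):
    """Minimize (1/(2m))||y - Xb||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    m, d = X.shape
    b = np.zeros(d)
    L = np.linalg.norm(X, 2) ** 2 / m  # Lipschitz constant of the smooth part
    if L == 0.0:
        return b
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / m
        b = soft_threshold(b - grad / L, lam / L)
    return b

def run_batch_bandit(betas, n_epochs=30, batch_size=20, n_teamwork=5,
                     lam=0.05, noise=0.1, seed=0):
    """Simulate a batch bandit with an assumed teamwork-then-selfish schedule.

    betas: (K, d) array of true sparse treatment effects (simulation only).
    Returns cumulative regret per epoch, the stage labels, and the estimates.
    """
    rng = np.random.default_rng(seed)
    K, d = betas.shape
    X_hist = [np.empty((0, d)) for _ in range(K)]
    y_hist = [np.empty(0) for _ in range(K)]
    beta_hat = np.zeros((K, d))
    regret = np.zeros(n_epochs)
    stages = []
    for t in range(n_epochs):
        X = rng.normal(size=(batch_size, d))  # covariates of this batch of users
        if t < n_teamwork:
            # Teamwork stage (assumed): spread the batch evenly across treatments.
            stages.append("teamwork")
            arms = rng.permutation(batch_size) % K
        else:
            # Selfish stage (assumed): each user gets the estimated best treatment.
            stages.append("selfish")
            arms = np.argmax(X @ beta_hat.T, axis=1)
        mean_rewards = X @ betas.T  # (batch_size, K) expected outcomes
        chosen = mean_rewards[np.arange(batch_size), arms]
        y = chosen + noise * rng.normal(size=batch_size)
        regret[t] = np.sum(mean_rewards.max(axis=1) - chosen)
        # Outcomes of the whole batch are observed before the next epoch;
        # refit the LASSO estimate of every treatment that received new data.
        for k in range(K):
            mask = arms == k
            if mask.any():
                X_hist[k] = np.vstack([X_hist[k], X[mask]])
                y_hist[k] = np.concatenate([y_hist[k], y[mask]])
                beta_hat[k] = lasso_ista(X_hist[k], y_hist[k], lam)
    return np.cumsum(regret), stages, beta_hat

if __name__ == "__main__":
    true_betas = np.zeros((2, 20))
    true_betas[0, 0] = 1.0   # treatment 0 depends on covariate 0
    true_betas[1, 1] = 1.0   # treatment 1 depends on covariate 1
    cum_regret, stages, beta_hat = run_batch_bandit(true_betas)
    print(stages[:6], cum_regret[-1])
```

The point of the sketch is the data flow: a whole batch is assigned before any of its outcomes are seen, and the per-treatment LASSO estimates are updated only between epochs. A faithful implementation would replace the fixed `n_teamwork` cutoff with the switching rule from the paper.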

research
07/22/2022

High dimensional stochastic linear contextual bandit with missing covariates

Recent works in bandit problems adopted lasso convergence theory in the ...
research
06/02/2021

Parallelizing Thompson Sampling

How can we make use of information parallelism in online decision making...
research
05/06/2020

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

Dynamic treatment regimes (DTRs) are personalized, sequential treatm...
research
04/10/2019

Active Learning for Decision-Making from Imbalanced Observational Data

Machine learning can help personalized decision support by learning mode...
research
07/03/2020

Developing a predictive signature for two trial endpoints using the cross-validated risk scores method

The existing cross-validated risk scores (CVRS) design has been proposed...
research
05/10/2021

Model-Assisted Uniformly Honest Inference for Optimal Treatment Regimes in High Dimension

This paper develops new tools to quantify uncertainty in optimal decisio...
research
07/29/2016

The Phylogenetic LASSO and the Microbiome

Scientific investigations that incorporate next generation sequencing in...
