A Practically Competitive and Provably Consistent Algorithm for Uplift Modeling
Randomized experiments have been critical tools of decision making for decades. However, subjects can show significant heterogeneity in response to treatments in many important applications. Therefore it is not enough to simply know which treatment is optimal for the entire population. What we need is a model that correctly customize treatment assignment base on subject characteristics. The problem of constructing such models from randomized experiments data is known as Uplift Modeling in the literature. Many algorithms have been proposed for uplift modeling and some have generated promising results on various data sets. Yet little is known about the theoretical properties of these algorithms. In this paper, we propose a new tree-based ensemble algorithm for uplift modeling. Experiments show that our algorithm can achieve competitive results on both synthetic and industry-provided data. In addition, by properly tuning the "node size" parameter, our algorithm is proved to be consistent under mild regularity conditions. This is the first consistent algorithm for uplift modeling that we are aware of.
READ FULL TEXT