Experimental Evaluation of Individualized Treatment Rules
In recent years, the increasing availability of individual-level data and advances in machine learning algorithms have spurred an explosion of methodological development for finding optimal individualized treatment rules (ITRs). These new tools are being applied in a variety of fields, including business, medicine, and politics. However, few methods exist for empirically evaluating the efficacy of ITRs. In particular, many existing ITR estimators are based on complex models and do not come with statistical uncertainty estimates. We consider a common real-world setting in which policy makers wish to predict the performance of a given ITR prior to its administration in a target population. We propose to use a randomized experiment for evaluating ITRs. Unlike existing methods, the proposed methodology is based on Neyman's repeated sampling approach and does not require modeling assumptions. As a result, it is applicable to the empirical evaluation of ITRs derived from a wide range of statistical and machine learning models. We conduct a simulation study to demonstrate the accuracy of the proposed methodology in small samples. We also apply our methods to the Project STAR (Student-Teacher Achievement Ratio) experiment to compare the performance of ITRs based on popular machine learning methods for estimating heterogeneous treatment effects.
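To make the evaluation idea concrete, the following is a minimal illustrative sketch (not the paper's exact estimator) of a design-based, Neyman-style estimate of an ITR's average value from a completely randomized experiment: units whose random assignment happens to agree with the rule's recommendation contribute their observed outcome, weighted by the realized group sizes. The function and variable names here (`itr_value_estimate`, `rule`) are hypothetical.

```python
import numpy as np

def itr_value_estimate(Y, T, X, rule):
    """Illustrative design-based estimate of the average outcome a
    population would attain if treated according to `rule`.

    Y    : observed outcomes (1-D array)
    T    : realized binary treatment assignments from the experiment
    X    : covariates passed to the rule
    rule : function mapping X to 0/1 treatment recommendations

    Sketch only, assuming complete randomization: treated outcomes are
    used where the rule recommends treatment (f = 1), control outcomes
    where it does not (f = 0), each averaged over its assignment group.
    """
    f = rule(X)                      # 0/1 recommendations of the ITR
    n1 = T.sum()                     # number of treated units
    n0 = len(T) - n1                 # number of control units
    treated_term = (T * f * Y).sum() / n1
    control_term = ((1 - T) * (1 - f) * Y).sum() / n0
    return treated_term + control_term
```

Because the estimator uses only the experimental randomization, no outcome model is fit, which is what makes the approach agnostic to how the ITR itself was estimated.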