When the Oracle Misleads: Modeling the Consequences of Using Observable Rather than Potential Outcomes in Risk Assessment Instruments
Machine learning-based Risk Assessment Instruments (RAIs) are increasingly used to support decision making in domains as diverse as healthcare, criminal justice, and consumer finance. When the decision maker’s goal is to reduce the risk of the predicted outcome, they are naturally concerned with potential outcomes. However, RAIs are typically designed to predict outcomes under the historical decision process that generated the data, making them unsuitable for helping decision makers choose among different courses of action. While many of the limitations of such RAIs have been recognized, there does not appear to be a general mathematical model that provides insight into how and why such RAIs can lead users astray. In this work, we aim to fill this gap, showing how RAIs based on observable outcomes can lead to worse outcomes, i.e., more severe departures from an optimal treatment regime, than before the RAI was introduced. This has nothing to do with the quality of prediction; it can occur even when (1) the oracle predictor is available and (2) there is no unmeasured confounding. We describe several dangerous properties of these RAIs and illustrate their suboptimality with a simple example.
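The core phenomenon can be sketched in a small simulation. This is an illustrative toy setting of my own construction, not the example from the paper: the population, risk probabilities, historical policy, and decision threshold below are all hypothetical. The point it demonstrates is the abstract's claim: an oracle predictor of the *observable* outcome (the outcome under the historical treatment policy) assigns low risk to exactly the patients whom the historical policy treated effectively, so a decision maker acting on it withholds treatment from those who need it most.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical population with a binary baseline risk factor.
r = rng.binomial(1, 0.5, n)  # 1 = high underlying risk

# Potential outcome probabilities (probability of an adverse event):
# untreated: high-risk 0.8, low-risk 0.3
# treated:   treatment helps only high-risk patients, cutting risk to 0.2
p_y0 = np.where(r == 1, 0.8, 0.3)  # P(Y = 1 | untreated)
p_y1 = np.where(r == 1, 0.2, 0.3)  # P(Y = 1 | treated)

# Historical decision process: clinicians treated high-risk patients.
a_hist = r
p_obs = np.where(a_hist == 1, p_y1, p_y0)  # observable-outcome risk

# An ORACLE predictor of the observable outcome returns p_obs exactly.
# A decision maker treats whenever predicted risk exceeds 0.25 --
# so it treats the low-risk group (0.3) and skips the high-risk group (0.2).
a_new = (p_obs > 0.25).astype(int)

# Realized adverse-event rates under each policy.
rate_hist = np.mean(np.where(a_hist == 1, p_y1, p_y0))
rate_new = np.mean(np.where(a_new == 1, p_y1, p_y0))
rate_opt = np.mean(np.minimum(p_y0, p_y1))  # optimal treatment regime

print(f"historical policy: {rate_hist:.3f}")
print(f"RAI-guided policy: {rate_new:.3f}")
print(f"optimal regime:    {rate_opt:.3f}")
```

In this toy setting the historical policy already matches the optimal regime, yet deploying the RAI roughly doubles the adverse-event rate: the departure from the optimal regime is *worse* after the RAI is introduced, despite perfect prediction and no unmeasured confounding, because the predicted quantity answers the wrong question.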