Causal Decision Trees
Uncovering causal relationships in data is a major objective of data analytics. Causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however are expensive or infeasible to be conducted in many cases. Causal relationships can also be found using some well designed observational studies, but they require domain experts' knowledge and the process is normally time consuming. Hence there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree where nodes have causal interpretations. Our method follows a well established causal inference framework and makes use of a classic statistical test. The method is practical for finding causal signals in large data sets.
READ FULL TEXT