Error Asymmetry in Causal and Anticausal Regression
It is generally difficult to make any statements about the expected prediction error in an univariate setting without further knowledge about how the data were generated. Recent work showed that knowledge about the real underlying causal structure of a data generation process has implications for various machine learning settings. Assuming an additive noise and an independence between data generating mechanism and its input, we draw a novel connection between the intrinsic causal relationship of two variables and the expected prediction error. We formulate the theorem that the expected error of the true data generating function as prediction model is generally smaller when the effect is predicted from its cause and, on the contrary, greater when the cause is predicted from its effect. The theorem implies an asymmetry in the error depending on the prediction direction. This is further corroborated with empirical evaluations in artificial and real-world data sets.
READ FULL TEXT