Semi-Generative Modelling: Domain Adaptation with Cause and Effect Features
This paper presents a novel, causally-inspired approach to domain adaptation that also incorporates unlabelled data into model fitting when labelled data is scarce. We consider a case of covariate-shift adaptation with cause and effect features and, drawing on recent ideas in causal modelling and invariant prediction, show how this setting leads to what we will refer to as a semi-generative model: P(Y, X_eff; X_cau, θ). Our proposed approach is robust to changes in the distribution over causal features, and naturally allows model constraints to be imposed through unsupervised learning of a map from causes to effects. In experiments on synthetic datasets, we demonstrate a significant improvement in the classification performance of our semi-generative model over purely supervised and importance-weighting baselines when the amount of labelled data is small. Moreover, we apply our approach to regression on real-world protein-count data and compare it to feature transformation methods.
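As a rough illustration of how such a model can use unlabelled data, the sketch below reads the semicolon in P(Y, X_eff; X_cau, θ) as conditioning on X_cau and takes Y to be discrete, as in the classification setting. Both are assumptions made here for illustration and are not stated in the abstract. The first line is simply the chain rule of probability. The second marginalises out the label, giving a likelihood term that can be evaluated on unlabelled pairs (X_cau, X_eff). For regression, the sum would become an integral over y.

```latex
\begin{align}
P(Y, X_{\mathrm{eff}} \mid X_{\mathrm{cau}}; \theta)
  &= P(Y \mid X_{\mathrm{cau}}; \theta)\,
     P(X_{\mathrm{eff}} \mid Y, X_{\mathrm{cau}}; \theta) \\
P(X_{\mathrm{eff}} \mid X_{\mathrm{cau}}; \theta)
  &= \sum_{y} P(Y = y \mid X_{\mathrm{cau}}; \theta)\,
     P(X_{\mathrm{eff}} \mid Y = y, X_{\mathrm{cau}}; \theta)
\end{align}
```

Under this reading, the distribution of X_cau never appears in the model, which is consistent with the claimed robustness to changes in the distribution over causal features.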