Controlling the False Discovery Rate via Knockoff for High Dimensional Ising Model Variable Selection
In high-dimensional data analysis, it is important to control the fraction of false discoveries while retaining sufficient power for variable selection. In many contemporary applications, a large share of the covariates are discrete. In this paper we propose the Ising knockoff (IKF) for variable selection in high-dimensional regression with discrete covariates. Under mild conditions, we show that the false discovery rate (FDR) is controlled below a target level in finite samples if the underlying Ising model is known, and that the FDR is asymptotically controlled even when the parameters of the Ising model are estimated. We also provide theoretical results on the power of the proposed method. Using simulations and a genome science data set, we show that IKF has higher power than existing knockoff procedures, which are mostly tailored to continuous covariate distributions.
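The abstract does not spell out the selection rule, so the following is only a minimal sketch of the generic knockoff+ thresholding step used in the knockoff filter literature, on the assumption that IKF ends with a step of this kind. The function name `knockoff_plus_select`, the statistics `W` (one signed importance score per covariate, where large positive values favor the real covariate over its knockoff), and the example numbers are all hypothetical and are not taken from the paper.

```python
def knockoff_plus_select(W, q):
    """Return indices selected by the knockoff+ threshold at target FDR level q.

    W is a list of signed feature statistics (hypothetical input): W[j] > 0
    suggests covariate j matters more than its knockoff copy.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of W, ascending.
    candidates = sorted({abs(w) for w in W if w != 0})
    for t in candidates:
        n_neg = sum(1 for w in W if w <= -t)  # proxy count of false discoveries
        n_pos = sum(1 for w in W if w >= t)   # number of selections at threshold t
        # The "+1" in the numerator is what distinguishes knockoff+ from the
        # plain knockoff threshold.
        if (1 + n_neg) / max(1, n_pos) <= q:
            return [j for j, w in enumerate(W) if w >= t]
    # No threshold meets the target level: select nothing.
    return []


# Hypothetical statistics for ten covariates.
W = [5, 4, 3, 2.5, -1, 2, 1.5, 0.5, -0.2, 6]
selected = knockoff_plus_select(W, q=0.2)
```

The loop returns at the smallest threshold whose estimated false discovery proportion is at most `q`, so the selection is as large as the target level allows.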