High Dimensional Robust Estimation of Sparse Models via Trimmed Hard Thresholding

01/24/2019
by Liu Liu, et al.

We study the problem of sparsity-constrained M-estimation with arbitrary corruptions to both explanatory and response variables in the high-dimensional regime, where the number of variables d is larger than the sample size n. Our main contribution is a highly efficient gradient-based optimization algorithm that we call Trimmed Hard Thresholding, a robust variant of Iterative Hard Thresholding (IHT) that replaces the sample mean in gradient computations with a trimmed mean. Our algorithm handles a wide class of sparsity-constrained M-estimation problems and tolerates a nearly dimension-independent fraction of arbitrarily corrupted samples. More specifically, when the corrupted fraction satisfies ϵ ≲ 1/(√k log(nd)), where k is the sparsity of the parameter, we obtain accurate estimation and model selection guarantees with optimal sample complexity. Furthermore, we extend our algorithm to sparse Gaussian graphical model (precision matrix) estimation via a neighborhood selection approach. We demonstrate the effectiveness of robust estimation in sparse linear regression, sparse logistic regression, and sparse precision matrix estimation on synthetic data and on real-world US equities data.
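To make the idea concrete, below is a minimal sketch (not the authors' released code) of a trimmed-mean variant of IHT for sparse linear regression: per-sample gradients are aggregated with a coordinate-wise trimmed mean that discards a fraction of extreme values in each coordinate before the hard-thresholding step. The function names, step size eta, trim fraction eps, and iteration count are illustrative assumptions, not values from the paper.

```python
# Sketch of trimmed-mean IHT for sparse linear regression (illustrative only).
import numpy as np

def trimmed_mean(G, eps):
    """Coordinate-wise trimmed mean: drop the eps-fraction of largest and
    smallest entries in each column of the n x d per-sample gradient matrix G."""
    n = G.shape[0]
    m = int(np.ceil(eps * n))
    G_sorted = np.sort(G, axis=0)
    if n - 2 * m <= 0:
        return G.mean(axis=0)
    return G_sorted[m:n - m].mean(axis=0)

def hard_threshold(w, k):
    """Keep the k largest-magnitude coordinates of w, zero out the rest."""
    w = w.copy()
    w[np.argsort(np.abs(w))[:-k]] = 0.0
    return w

def trimmed_iht(X, y, k, eps=0.05, eta=0.1, iters=200):
    """Robust IHT: gradient step with trimmed-mean aggregation, then projection
    onto the set of k-sparse vectors."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        residual = X @ w - y          # shape (n,)
        G = residual[:, None] * X     # per-sample least-squares gradients, (n, d)
        grad = trimmed_mean(G, eps)   # robust aggregate gradient
        w = hard_threshold(w - eta * grad, k)
    return w
```

In practice the trim fraction would be tied to the assumed corruption level ϵ, and the same aggregation step can be reused for other sparsity-constrained M-estimation losses (e.g., logistic regression) by changing only the per-sample gradient computation.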
