M-estimation with the Trimmed ℓ_1 Penalty
We study high-dimensional M-estimators with the trimmed ℓ_1 penalty. While the standard ℓ_1 penalty incurs a shrinkage bias, the trimmed ℓ_1 penalty leaves the h largest entries penalty-free. This family of estimators includes the Trimmed Lasso for sparse linear regression and its counterpart for sparse graphical model estimation. The trimmed ℓ_1 penalty is non-convex, and unlike other non-convex regularizers such as SCAD and MCP, it is not amenable, so prior analyses cannot be applied. We characterize the support recovery of the estimates as a function of the trimming parameter h. Under certain conditions, we show that for any local optimum, (i) if the trimming parameter h is smaller than the true support size, all zero entries of the true parameter vector are successfully estimated as zero, and (ii) if h is larger than the true support size, the irrelevant parameters of the local optimum have smaller absolute values than the relevant parameters, and hence the relevant parameters are not penalized. We then bound the ℓ_2 error of any local optimum. These bounds are asymptotically comparable to those for non-convex amenable penalties such as SCAD or MCP, but enjoy better constants. We specialize our main results to linear regression and graphical model estimation. Finally, we develop a fast, provably convergent optimization algorithm for the trimmed-regularizer problem. The algorithm has the same rate of convergence as difference-of-convex (DC)-based approaches, but is faster in practice and finds better objective values than recently proposed algorithms for DC optimization. Empirical results further demonstrate the value of ℓ_1 trimming.
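For concreteness, here is a minimal sketch of the trimmed ℓ_1 penalty described above: for a parameter vector θ ∈ R^p, it sums the p − h smallest absolute entries, leaving the h largest penalty-free. The function name `trimmed_l1` and the example values are illustrative, not taken from the paper.

```python
import numpy as np

def trimmed_l1(theta, h):
    """Trimmed l1 penalty: the sum of the p - h smallest absolute
    entries of theta; the h largest entries are left penalty-free.
    (Hypothetical helper illustrating the definition in the abstract.)"""
    abs_sorted = np.sort(np.abs(theta))        # ascending order
    return abs_sorted[:len(theta) - h].sum()   # drop the h largest

# Example: with h = 2, the two largest entries (5 and -3) are untouched.
theta = np.array([5.0, -3.0, 0.5, 0.1, 0.0])
print(trimmed_l1(theta, h=2))  # 0.6 = |0.5| + |0.1| + |0.0|
```

Note that setting h = 0 recovers the ordinary ℓ_1 norm, so the trimming parameter interpolates between full ℓ_1 shrinkage and leaving a fixed number of coordinates unpenalized.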
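The abstract does not spell out the paper's own algorithm, but the DC baseline it is compared against follows from the standard decomposition R_h(θ) = ‖θ‖_1 − T_h(θ), where T_h(θ) is the (convex) sum of the h largest absolute entries. The sketch below, assuming a least-squares loss, linearizes −T_h at each outer step and solves the resulting weighted Lasso by proximal gradient; it illustrates the DC reference approach, not the paper's faster method, and the function name and iteration counts are illustrative.

```python
import numpy as np

def dc_trimmed_lasso(X, y, lam, h, n_outer=50, n_inner=200):
    """DC-style solver for the trimmed Lasso (reference baseline, not
    the paper's algorithm): minimize
        (1/2n)||y - X theta||^2 + lam*(||theta||_1 - T_h(theta)),
    linearizing the concave part -lam*T_h at each outer iteration."""
    n, p = X.shape
    theta = np.zeros(p)
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    for _ in range(n_outer):
        # Subgradient of T_h at theta: signs on the h largest |theta_i|.
        top = np.argsort(-np.abs(theta))[:h]
        g = np.zeros(p)
        g[top] = np.sign(theta[top])
        # Inner convex problem:
        #   (1/2n)||y - X theta||^2 - lam*g.theta + lam*||theta||_1,
        # solved by ISTA (gradient step + soft-thresholding).
        for _ in range(n_inner):
            grad = X.T @ (X @ theta - y) / n - lam * g
            z = theta - grad / L
            theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return theta
```

Each outer step decreases the original non-convex objective because the linearization majorizes it, which is the source of the DC convergence rate the paper matches.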