Truncated Linear Regression in High Dimensions

07/29/2020
by Constantinos Daskalakis, et al.

As in standard linear regression, in truncated linear regression we are given access to observations (A_i, y_i)_i whose dependent variable equals y_i = A_i^T · x^* + η_i, where x^* is some fixed unknown vector of interest and η_i is independent noise; except we are only given an observation if its dependent variable y_i lies in some "truncation set" S ⊂ ℝ. The goal is to recover x^* under some favorable conditions on the A_i's and the noise distribution. We prove that there exists a computationally and statistically efficient method for recovering k-sparse n-dimensional vectors x^* from m truncated samples, which attains an optimal ℓ_2 reconstruction error of O(√((k log n)/m)). As a corollary, our guarantees imply a computationally efficient and information-theoretically optimal algorithm for compressed sensing with truncation, which may arise from measurement saturation effects. Our result follows from a statistical and computational analysis of the Stochastic Gradient Descent (SGD) algorithm for solving a natural adaptation of the LASSO optimization problem that accommodates truncation. This generalizes the works of both: (1) [Daskalakis et al. 2018], where no regularization is needed due to the low dimensionality of the data, and (2) [Wainwright 2009], where the objective function is simple due to the absence of truncation. In order to deal with both truncation and high dimensionality at the same time, we develop new techniques that not only generalize the existing ones but, we believe, are also of independent interest.
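To make the approach concrete, the sketch below illustrates (and is not the authors' algorithm) SGD on a LASSO-style objective adapted to truncation, under the simplifying assumptions of unit-variance Gaussian noise and access to a membership oracle in_S for the truncation set S. The gradient of the truncated negative log-likelihood is estimated with a single rejection sample, and the ℓ_1 penalty is handled by a soft-thresholding (proximal) step. All function names and parameter choices here are hypothetical.

```python
import numpy as np

def sample_truncated_noise(mu, in_S, max_tries=1000):
    # Rejection-sample z ~ N(mu, 1) conditioned on z lying in the truncation
    # set S, using the membership oracle in_S (an assumption of this sketch).
    for _ in range(max_tries):
        z = np.random.normal(mu, 1.0)
        if in_S(z):
            return z
    return mu  # fallback if S has very little mass around mu

def truncated_lasso_sgd(A, y, in_S, lam=0.1, step=0.01, epochs=50):
    # Hypothetical sketch: SGD on a truncated-LASSO objective.
    #   A    : (m, n) design matrix of the retained observations
    #   y    : (m,) responses, each of which survived truncation (y_i in S)
    #   in_S : membership oracle for the truncation set S
    #   lam  : l1 weight; theory suggests a scale around sqrt(log(n)/m)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(epochs):
        for i in np.random.permutation(m):
            mu = A[i] @ x
            # For unit-variance Gaussian noise, the gradient of the truncated
            # negative log-likelihood with respect to x is
            # (E[z | z ~ N(mu, 1), z in S] - y_i) * A_i; one rejection sample
            # gives an unbiased estimate of the conditional expectation.
            z = sample_truncated_noise(mu, in_S)
            x = x - step * (z - y[i]) * A[i]
            # Proximal step for the l1 penalty (soft-thresholding), promoting sparsity.
            x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)
    return x

# Example usage with one-sided truncation S = (0, infinity):
# x_hat = truncated_lasso_sgd(A, y, in_S=lambda z: z > 0.0)
```

In practice the step size and regularization weight would need tuning, and the paper's guarantees rely on additional care (for instance in controlling the iterates) that this simplified sketch omits.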
