Statistical Inference for Algorithmic Leveraging
The age of big data has produced data sets that are computationally expensive to analyze. To deal with such large-scale data sets, the method of algorithmic leveraging proposes that we sample according to some special distribution, rescale the data, and then perform analysis on the smaller sample. Ma, Mahoney, and Yu (2015) provides a framework to determine the statistical properties of algorithmic leveraging in the context of estimating the regression coefficients in a linear model with a fixed number of predictors. In this paper, we discuss how to perform statistical inference on regression coefficients estimated using algorithmic leveraging. In particular, we show how to construct confidence intervals for each estimated coefficient and present an efficient algorithm for doing so when the error variance is known. Through simulations, we confirm that our procedure controls the type I errors of significance tests for the regression coefficients and show that it has good power for those tests.
READ FULL TEXT