A General Framework of Online Updating Variable Selection for Generalized Linear Models with Streaming Datasets
In big data research, an important problem is how to recover the sequentially changing sets of true features when data sets arrive sequentially. This paper presents a general framework for online updating variable selection and parameter estimation in generalized linear models with streaming datasets. The framework is based on online updating penalized likelihoods with differentiable or non-differentiable penalty functions. An online updating coordinate descent algorithm is proposed to solve the resulting optimization problem, and a method for selecting the tuning parameter in an online updating manner is suggested. Selection consistency, estimation consistency, and the oracle property are established theoretically. The methods are further examined and illustrated with numerical examples from both simulation experiments and a real data analysis.
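
As a rough illustration of what online updating coordinate descent can look like, the sketch below treats only the simplest special case and is not the algorithm from the paper. It assumes a Gaussian linear model (a GLM with identity link) and a lasso penalty, for which the cumulative summaries X^T X and X^T y are enough to update the estimate without revisiting earlier batches. The class name `OnlineLasso`, the fixed tuning parameter `lam`, and the stopping rule are all hypothetical choices made for this example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator, the coordinate-wise lasso solution."""
    return np.sign(z) * max(abs(z) - t, 0.0)


class OnlineLasso:
    """Hypothetical sketch: lasso for a Gaussian linear model on streaming batches.

    Only the running summaries X^T X, X^T y and the sample count are stored,
    so raw data from earlier batches can be discarded after each update.
    """

    def __init__(self, p, lam):
        self.p = p                    # number of candidate features
        self.lam = lam                # lasso tuning parameter (held fixed here)
        self.n = 0                    # cumulative sample size
        self.xtx = np.zeros((p, p))   # cumulative X^T X
        self.xty = np.zeros(p)        # cumulative X^T y
        self.beta = np.zeros(p)       # current estimate, reused as a warm start

    def update(self, X, y, n_sweeps=200, tol=1e-10):
        """Absorb one batch, then rerun coordinate descent on the summaries."""
        self.xtx += X.T @ X
        self.xty += X.T @ y
        self.n += X.shape[0]
        for _ in range(n_sweeps):
            max_change = 0.0
            for j in range(self.p):
                gjj = self.xtx[j, j]
                if gjj == 0.0:
                    continue  # feature j has been identically zero so far
                # Correlation of feature j with the partial residual
                # that excludes feature j's own contribution.
                rho = self.xty[j] - self.xtx[j] @ self.beta + gjj * self.beta[j]
                new = soft_threshold(rho / self.n, self.lam) / (gjj / self.n)
                max_change = max(max_change, abs(new - self.beta[j]))
                self.beta[j] = new
            if max_change < tol:
                break
        return self.beta.copy()

    def selected(self):
        """Indices of features currently estimated as nonzero."""
        return np.flatnonzero(self.beta)
```

Calling `update` once per incoming batch returns the current coefficient estimate, and `selected()` gives the active set at that point in the stream. For non-Gaussian GLMs the likelihood is not summarized exactly by these two quantities, so an online method needs a different summary than the one used in this sketch.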