Clustered regression with unknown clusters

03/23/2011
by Kishor Barman, et al.

We consider a collection of prediction experiments that are clustered in the sense that groups of experiments exhibit a similar relationship between the predictor and response variables. Both the experiment clusters and the regression relationships are unknown. The regression relationships define the experiment clusters, and in general the predictor and response variables themselves may not exhibit any clustering. We call this prediction problem clustered regression with unknown clusters (CRUC), and in this paper we focus on linear regression. We study and compare several methods for CRUC, demonstrate their applicability to the Yahoo Learning-to-rank Challenge (YLRC) dataset, and investigate an associated mathematical model. CRUC is at the crossroads of many prior works, and we study several prediction algorithms with diverse origins: an adaptation of the expectation-maximization algorithm, an approach inspired by K-means clustering, the singular value thresholding approach to matrix rank minimization under quadratic constraints, an adaptation of the Curds and Whey method in multiple regression, and a local regression (LoR) scheme reminiscent of neighborhood methods in collaborative filtering. Based on empirical evaluation on the YLRC dataset as well as on simulated data, we identify the LoR method as a good practical choice: it yields the best or near-best prediction performance at a reasonable computational load, and it is less sensitive to the choice of the algorithm parameter. We also provide some analysis of the LoR method for an associated mathematical model, which sheds light on the optimal parameter choice and on prediction performance.
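
The abstract names an approach inspired by K-means clustering but gives no algorithmic detail. As a rough illustration of what such a scheme could look like for linear CRUC, the sketch below alternates between fitting one pooled least-squares regression per cluster and reassigning each experiment to the cluster whose coefficients fit it best. The function name `kmeans_style_cruc`, its parameters, and the random initialization are assumptions made for this example; this is not the authors' algorithm.

```python
import numpy as np

def kmeans_style_cruc(experiments, n_clusters, n_iters=20, seed=0):
    """Hypothetical K-means-style alternating scheme for clustered linear regression.

    experiments: list of (X, y) pairs, X of shape (n_i, d), y of shape (n_i,).
    Returns (labels, betas): a cluster label per experiment and one
    coefficient vector per cluster.
    """
    rng = np.random.default_rng(seed)
    # Assumption: start from a uniformly random assignment of experiments.
    labels = rng.integers(n_clusters, size=len(experiments))
    d = experiments[0][0].shape[1]
    betas = np.zeros((n_clusters, d))
    for _ in range(n_iters):
        # Refit step: pooled least squares over all experiments in a cluster.
        for k in range(n_clusters):
            members = [e for e, lab in zip(experiments, labels) if lab == k]
            if not members:
                continue  # empty cluster: keep its previous coefficients
            X = np.vstack([m[0] for m in members])
            y = np.concatenate([m[1] for m in members])
            betas[k] = np.linalg.lstsq(X, y, rcond=None)[0]
        # Assignment step: pick the cluster with the smallest squared residual.
        new_labels = np.array([
            np.argmin([np.sum((y - X @ b) ** 2) for b in betas])
            for X, y in experiments
        ])
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, betas
```

Each step can only lower (or keep) the total squared residual, so the result fits the training data at least as well as a single regression pooled over all experiments. Like K-means, the scheme can stop at a local optimum that depends on the initial assignment.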

research
02/02/2022

VC-PCR: A Prediction Method based on Supervised Variable Selection and Clustering

Sparse linear prediction methods suffer from decreased prediction accura...
research
07/18/2022

Analyzing Clustered Continuous Response Variables with Ordinal Regression Models

Continuous response variables often need to be transformed to meet regre...
research
11/03/2020

Spatially Clustered Regression

Spatial regression or geographically weighted regression models have bee...
research
04/28/2018

Novel Prediction Techniques Based on Clusterwise Linear Regression

In this paper we explore different regression models based on Clusterwis...
research
07/05/2016

Algorithms for Generalized Cluster-wise Linear Regression

Cluster-wise linear regression (CLR), a clustering problem intertwined w...
research
07/20/2020

Spatially Clustered Varying Coefficient Model

In various applications with large spatial regions, the relationship bet...
research
11/20/2019

Mixtures of multivariate generalized linear models with overlapping clusters

With the advent of ubiquitous monitoring and measurement protocols, stud...
