Multi-view predictive partitioning in high dimensions

02/02/2012
by Brian McWilliams, et al.

Many modern data mining applications involve datasets in which each observation is described by paired high-dimensional vector representations, or "views". Typical examples arise in web mining and genomics. In this article we present an algorithm for clustering data with multiple views, Multi-View Predictive Partitioning (MVPP), which relies on a novel criterion of predictive similarity between data points. We assume that, within each cluster, the dependence between the multivariate views can be modelled by a two-block partial least squares (TB-PLS) regression model, which performs dimensionality reduction and is particularly suitable for high-dimensional settings. The proposed MVPP algorithm partitions the data so that the within-cluster predictive ability between views is maximised. The objective function depends on a measure of the predictive influence of points under the TB-PLS model, derived as an extension of the PRESS statistic commonly used in ordinary least squares regression. Using simulated data, we compare the performance of MVPP with that of competing multi-view clustering methods, which rely on the geometric structure of the points but ignore the predictive relationship between the two views. State-of-the-art results are obtained on benchmark web mining datasets.
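For readers unfamiliar with the PRESS statistic mentioned above, the following is a minimal sketch of its standard form for ordinary least squares with a single predictor. It is not the TB-PLS extension proposed in the article, only the classical quantity that the abstract says the extension builds on. The function names and the toy data are hypothetical and chosen purely for illustration. The sketch relies on the well-known identity that the leave-one-out residual of point i equals e_i / (1 - h_ii), where e_i is the ordinary residual and h_ii is the leverage, so PRESS can be computed from one fit rather than n refits.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def press_shortcut(xs, ys):
    """PRESS from a single fit, using e_i / (1 - h_ii) as the leave-one-out residual."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - x_bar) ** 2 / sxx  # h_ii for simple regression
        residual = y - (a + b * x)
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_brute_force(xs, ys):
    """PRESS by refitting the model n times, each time leaving one point out."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total


# Toy data (illustrative only): the last point departs from the linear trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.3, 9.0]
print(press_shortcut(xs, ys), press_brute_force(xs, ys))
```

The two functions should agree up to floating-point error. Each summand measures how poorly a point is predicted by a model fitted without it, which is the intuition behind using such a quantity to judge whether a point belongs with the others in a cluster.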

