Pair-Wise Cluster Analysis

09/19/2010
by David R. Hardoon, et al.
Uppsala universitet
Apple Inc

This paper studies the problem of learning clusters that are consistently present in different (continuously valued) representations of observed data. Our setup differs slightly from the standard approach to (co-)clustering in that a form of 'labeling' becomes available: a cluster is only interesting if it has a counterpart in the alternative representation. The contribution of this paper is twofold: (i) the problem setting is explored and an analysis in terms of the PAC-Bayesian theorem is presented; (ii) a practical kernel-based algorithm is derived by exploiting the inherent relation to Canonical Correlation Analysis (CCA), together with its extension to multiple views. A content-based information retrieval (CBIR) case study on the multi-lingual aligned Europarl document dataset supports these findings.
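To make the link between CCA and clusters shared across two views concrete, here is a minimal sketch. It is not the paper's kernel-based algorithm: it uses plain regularized linear CCA, and the sign-thresholding step, the synthetic data, and all names (`linear_cca`, `reg`, `labels_x`, `labels_y`) are assumptions chosen for illustration only.

```python
import numpy as np

def linear_cca(X, Y, reg=1e-3):
    """First pair of canonical directions for views X (n x p) and Y (n x q).

    Illustrative regularized linear CCA via whitening and SVD;
    `reg` is a hypothetical ridge term added for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}
    M = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M)
    # Map the leading singular vectors back to the original coordinates
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return wx, wy, s[0]

# Synthetic two-view data sharing one hidden binary cluster z (assumption).
rng = np.random.default_rng(0)
n = 400
z = rng.choice([-1.0, 1.0], size=n)
X = np.outer(z, [2.0, 0.0, 1.0]) + 0.5 * rng.standard_normal((n, 3))
Y = np.outer(z, [1.0, -2.0]) + 0.5 * rng.standard_normal((n, 2))

wx, wy, rho = linear_cca(X, Y)

# Assign each point to a cluster in each view by the sign of its projection.
labels_x = (X - X.mean(axis=0)) @ wx > 0
labels_y = (Y - Y.mean(axis=0)) @ wy > 0
agreement = np.mean(labels_x == labels_y)
print(f"canonical correlation: {rho:.3f}, cross-view agreement: {agreement:.3f}")
```

In this toy setting a cluster counts as "interesting" in the sense of the abstract when the assignments obtained independently from the two views agree; the high agreement here simply reflects that both synthetic views were generated from the same hidden label.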
