Similarity-based Learning via Data Driven Embeddings

12/22/2011
by Purushottam Kar, et al.

We consider the problem of classification using similarity/distance functions over data. Specifically, we propose a framework for defining the goodness of a (dis)similarity function with respect to a given learning task, and we propose algorithms that have guaranteed generalization properties when working with such good functions. Our framework unifies and generalizes the frameworks proposed by [Balcan-Blum ICML 2006] and [Wang et al. ICML 2007]. An attractive feature of our framework is its adaptability to data: we do not promote a fixed notion of goodness but rather let the data dictate it. We show, with theoretical guarantees, that the goodness criterion best suited to a problem can itself be learned, which makes our approach applicable to a variety of domains and problems. We propose a landmarking-based approach to obtaining a classifier from such learned goodness criteria. We then provide a novel diversity-based heuristic that performs task-driven selection of landmark points in place of random selection. We demonstrate the effectiveness of our goodness-criterion learning method, as well as of the landmark selection heuristic, on a variety of similarity-based learning datasets and benchmark UCI datasets, on which our method consistently outperforms existing approaches by a significant margin.
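To make the landmarking idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Gaussian similarity function, uniformly random landmark selection (the baseline the abstract contrasts against), and a ridge-regularized least-squares linear classifier; none of these choices is specified in the abstract. All function names (`gaussian_similarity`, `landmark_embedding`, `fit_linear`, `predict`, `demo`) are hypothetical.

```python
import numpy as np

def gaussian_similarity(x, y, gamma=0.5):
    # Assumed similarity function; the abstract does not fix one.
    return float(np.exp(-gamma * np.sum((x - y) ** 2)))

def landmark_embedding(X, landmarks, similarity):
    # Row i holds the similarities of X[i] to each landmark point.
    return np.array([[similarity(x, l) for l in landmarks] for x in X])

def fit_linear(Phi, y, reg=1e-3):
    # Ridge-regularized least squares on +/-1 labels (an assumed learner).
    A = np.hstack([Phi, np.ones((Phi.shape[0], 1))])  # append a bias column
    return np.linalg.solve(A.T @ A + reg * np.eye(A.shape[1]), A.T @ y)

def predict(Phi, w):
    A = np.hstack([Phi, np.ones((Phi.shape[0], 1))])
    return np.where(A @ w >= 0, 1, -1)

def demo(seed=0):
    # Synthetic two-class data: two Gaussian blobs in the plane.
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
    y = np.concatenate([-np.ones(100), np.ones(100)])
    # Random landmark selection; a task-driven heuristic would replace this step.
    idx = rng.choice(len(X), size=10, replace=False)
    Phi = landmark_embedding(X, X[idx], gaussian_similarity)
    w = fit_linear(Phi, y)
    return float(np.mean(predict(Phi, w) == y))  # training accuracy

if __name__ == "__main__":
    print("training accuracy:", demo())
```

In this sketch the similarity function is used only to build the embedding, so any (dis)similarity, including a non-metric or non-PSD one, could be substituted for `gaussian_similarity` without changing the rest of the code.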

