Classification with Nearest Disjoint Centroids

09/21/2021
by Nicolas Fraiman, et al.

In this paper, we develop a new classification method based on the nearest centroid classifier, which we call the nearest disjoint centroid classifier. Our method differs from the nearest centroid classifier in two respects: (1) the centroids are defined on disjoint subsets of features instead of all the features, and (2) the distance is induced by the dimensionality-normalized norm instead of the Euclidean norm. We provide a few theoretical results regarding our method. In addition, we propose a simple algorithm, based on adapted k-means clustering, that finds the disjoint subsets of features used in our method, and we extend the algorithm to perform feature selection. We evaluate our method against closely related classifiers on both simulated data and real-world gene expression datasets. The results show that, across a range of settings, our method can outperform competing classifiers by achieving smaller misclassification rates, using fewer features, or both.
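The two differences described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one feature subset per class, takes the subsets as given rather than learning them, and takes the dimensionality-normalized distance to be the mean squared deviation over the subset. The names `fit_centroids`, `predict`, and `subsets` are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, subsets):
    """Compute each class centroid on that class's own feature subset.

    `subsets` maps a class label to a list of feature indices; the
    subsets are assumed to be disjoint across classes.
    """
    return {k: X[y == k][:, idx].mean(axis=0) for k, idx in subsets.items()}

def predict(X, centroids, subsets):
    """Assign each row to the class with the smallest normalized distance."""
    labels = list(subsets)
    # Assumed dimensionality-normalized squared distance: the squared
    # deviation is averaged over the subset, so subsets of different
    # sizes are comparable.
    dists = np.stack(
        [((X[:, subsets[k]] - centroids[k]) ** 2).mean(axis=1) for k in labels],
        axis=1,
    )
    return np.array(labels)[dists.argmin(axis=1)]

# Toy data: class 0 is tight on features 0-1, class 1 on features 2-4.
X = np.array([
    [0.0, 0.0, 5.0, 5.0, 5.0],
    [0.2, 0.2, 5.0, 5.0, 5.0],
    [3.0, 3.0, 1.0, 1.0, 1.0],
    [3.0, 3.0, 1.2, 1.2, 1.2],
])
y = np.array([0, 0, 1, 1])
subsets = {0: [0, 1], 1: [2, 3, 4]}

centroids = fit_centroids(X, y, subsets)
pred = predict(X, centroids, subsets)
print(pred)  # [0 0 1 1]
```

Averaging over the subset size, rather than summing, keeps a class with a large feature subset from being penalized merely for having more coordinates, which is the motivation this sketch assumes for the normalization.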


