DBSCAN++: Towards fast and scalable density clustering

10/31/2018
by Jennifer Jang, et al.

DBSCAN is a classical density-based clustering procedure which has had tremendous practical relevance. However, it implicitly needs to compute the empirical density for each sample point, leading to a quadratic worst-case time complexity, which may be too slow on large datasets. We propose DBSCAN++, a simple modification of DBSCAN which only requires computing the densities for a subset of the points. We show empirically that, compared to traditional DBSCAN, DBSCAN++ can provide not only competitive performance but also added robustness to the choice of the bandwidth hyperparameter, while taking a fraction of the runtime. We also present statistical consistency guarantees showing the trade-off between computational cost and estimation rates. Surprisingly, up to a certain point, we can enjoy the same estimation rates while lowering computational cost, showing that DBSCAN++ is a sub-quadratic algorithm that attains minimax optimal rates for level-set estimation, a quality that may be of independent interest.
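
To make the idea of computing densities for only a subset of the points concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `dbscan_pp_sketch`, the uniform subsampling, the `sample_frac` parameter, and the rule that assigns each point to the cluster of its nearest core point are all assumptions of this sketch, since the abstract does not spell them out. The point it tries to show is that density queries cost m x n distance computations for m sampled points, rather than n x n.

```python
import numpy as np

def dbscan_pp_sketch(X, eps, min_pts, sample_frac=0.3, seed=0):
    """Hypothetical DBSCAN-style clustering with densities computed on a subsample."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    rng = np.random.default_rng(seed)
    m = max(1, int(round(sample_frac * n)))
    # Assumption: the subset is drawn uniformly at random without replacement.
    sample = rng.choice(n, size=m, replace=False)

    # Density queries only for the m sampled points (m x n distances).
    d_sample = np.linalg.norm(X[sample][:, None, :] - X[None, :, :], axis=2)
    counts = (d_sample <= eps).sum(axis=1)  # each count includes the point itself
    core = sample[counts >= min_pts]

    labels = np.full(n, -1)  # -1 marks noise / unassigned points
    if len(core) == 0:
        return labels

    # Link core points that lie within eps of each other, then label
    # connected components with a depth-first search.
    Xc = X[core]
    adj = np.linalg.norm(Xc[:, None, :] - Xc[None, :, :], axis=2) <= eps
    core_labels = np.full(len(core), -1)
    cluster = 0
    for i in range(len(core)):
        if core_labels[i] != -1:
            continue
        core_labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            for k in np.flatnonzero(adj[j] & (core_labels == -1)):
                core_labels[k] = cluster
                stack.append(k)
        cluster += 1

    # Assumption: every point joins the cluster of its nearest core point,
    # provided that core point is within eps; otherwise it stays noise.
    d_all = np.linalg.norm(X[:, None, :] - Xc[None, :, :], axis=2)
    nearest = d_all.argmin(axis=1)
    within = d_all[np.arange(n), nearest] <= eps
    labels[within] = core_labels[nearest[within]]
    return labels
```

Calling `dbscan_pp_sketch(X, eps=0.5, min_pts=5, sample_frac=0.3)` returns one integer label per row of `X`, with `-1` for points left unassigned. The dense distance matrices keep the sketch short; a spatial index would be the more natural choice on large datasets.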


