Maintaining AUC and H-measure over time

12/12/2021
by Nikolaj Tatti, et al.

Measuring the performance of a classifier is a vital task in machine learning. The running time of an algorithm that computes the measure plays a very small role in an offline setting, for example, when the classifier is being developed by a researcher. However, the running time becomes more crucial if our goal is to monitor the performance of a classifier over time. In this paper we study three algorithms for maintaining two measures. The first algorithm maintains the area under the ROC curve (AUC) under addition and deletion of data points in O(log n) time. This is done by keeping the data points sorted in a self-balancing search tree. In addition, we augment the search tree so that we can query the ROC coordinates of a data point in O(log n) time. In doing so we are able to maintain AUC in O(log n) time. Our next two algorithms maintain the H-measure, an alternative measure based on the ROC curve. Computing the measure is a two-step process: first we compute the convex hull of the ROC curve, and then we compute a sum over the convex hull. We demonstrate that we can maintain the convex hull using a minor modification of the classic convex hull maintenance algorithm. We then show that, under certain conditions, we can compute the H-measure exactly in O(log^2 n) time, and that if the conditions are not met, we can estimate the H-measure in O((log n + ϵ^-1) log n) time. We show empirically that our methods are significantly faster than the baselines.
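The incremental bookkeeping behind maintaining AUC under additions and deletions can be illustrated with a minimal sketch. This is not the paper's data structure: it swaps the augmented self-balancing search tree for plain sorted Python lists, so lookups take O(log n) but each list insertion or deletion costs O(n). The class name `RunningAUC` and its methods are hypothetical, and ties between a positive and a negative score are assumed to count as half a pair.

```python
from bisect import bisect_left, bisect_right, insort


class RunningAUC:
    """Illustrative sketch: maintain AUC as (score, label) points come and go.

    Assumption: sorted lists stand in for the augmented balanced tree, so
    updates here are O(n), not the O(log n) stated in the abstract.
    """

    def __init__(self):
        self.pos = []  # sorted scores of positive points
        self.neg = []  # sorted scores of negative points
        self.u = 0.0   # pairs with positive score > negative score; ties add 0.5

    def _pairs(self, score, label):
        # Pair count contributed by one point against the opposite class.
        if label:
            below = bisect_left(self.neg, score)
            ties = bisect_right(self.neg, score) - below
            return below + 0.5 * ties
        hi = bisect_right(self.pos, score)
        ties = hi - bisect_left(self.pos, score)
        return (len(self.pos) - hi) + 0.5 * ties

    def add(self, score, label):
        self.u += self._pairs(score, label)
        insort(self.pos if label else self.neg, score)

    def remove(self, score, label):
        side = self.pos if label else self.neg
        i = bisect_left(side, score)
        if i == len(side) or side[i] != score:
            raise KeyError((score, label))
        side.pop(i)
        self.u -= self._pairs(score, label)

    def auc(self):
        total = len(self.pos) * len(self.neg)
        return self.u / total if total else float("nan")
```

Each update touches only the pairs that involve the changed point, which is why a structure that answers "how many opposite-class points lie below this score" quickly is enough to keep AUC current without recomputing it from scratch.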


