Directly and Efficiently Optimizing Prediction Error and AUC of Linear Classifiers

02/07/2018
by Hiva Ghanbari, et al.

The predictive quality of machine learning models is typically measured in terms of their (approximate) expected prediction error or the so-called Area Under the Curve (AUC) for a particular data distribution. However, when models are constructed by means of empirical risk minimization, surrogate functions such as the logistic loss are optimized instead. This is done because the empirical approximations of the expected error and AUC functions are nonconvex and nonsmooth and, more importantly, have zero derivative almost everywhere. In this work, we show that in the case of linear predictors, and under the assumption that the data are normally distributed, the expected error and the expected AUC are not only smooth but also have closed-form expressions, which depend on the first and second moments of the normal distribution. We therefore derive the derivatives of these two functions and use them in an optimization algorithm to directly optimize the expected error and the AUC. For real data sets, the derivatives can be approximated using empirical moments. We show that even when the data are not normally distributed, the computed derivatives are useful enough to yield an efficient optimization method and high-quality solutions. Thus, we propose a gradient-based optimization method for direct optimization of the prediction error and AUC. Moreover, the per-iteration complexity of the proposed algorithm does not depend on the size of the data set, unlike that of logistic regression and all other well-known empirical risk minimization problems.
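To make the closed-form claim concrete, here is a minimal sketch of how the expected error and expected AUC of a linear classifier can be evaluated from class moments alone. It is a standard Gaussian calculation written under stated assumptions, not necessarily the exact expressions derived in the paper: labels are +1 and -1, the classifier is sign(w^T x) with no bias term, each class is normally distributed with its own mean and covariance, and for the AUC the positive and negative samples are drawn independently. Under these assumptions w^T x is itself normal, so the error is p_pos * Phi(-w^T mu_pos / s_pos) + (1 - p_pos) * Phi(w^T mu_neg / s_neg) and the AUC is Phi(w^T (mu_pos - mu_neg) / sqrt(s_pos^2 + s_neg^2)), where s^2 = w^T Sigma w and Phi is the standard normal CDF. All function and variable names below are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def quad_form(w, cov):
    """Quadratic form w^T cov w for a square matrix given as nested lists."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


def expected_error(w, p_pos, mu_pos, cov_pos, mu_neg, cov_neg):
    """Expected 0-1 error of sign(w^T x) under class-conditional Gaussians.

    Assumption (not from the source): no bias term, labels in {+1, -1},
    p_pos is the prior probability of the positive class.
    """
    s_pos = math.sqrt(quad_form(w, cov_pos))
    s_neg = math.sqrt(quad_form(w, cov_neg))
    # P(w^T x <= 0 | positive): a positive point is misclassified.
    miss_pos = normal_cdf(-dot(w, mu_pos) / s_pos)
    # P(w^T x > 0 | negative): a negative point is misclassified.
    miss_neg = normal_cdf(dot(w, mu_neg) / s_neg)
    return p_pos * miss_pos + (1.0 - p_pos) * miss_neg


def expected_auc(w, mu_pos, cov_pos, mu_neg, cov_neg):
    """P(w^T x_pos > w^T x_neg), assuming independent Gaussian draws."""
    mean_diff = dot(w, mu_pos) - dot(w, mu_neg)
    std_diff = math.sqrt(quad_form(w, cov_pos) + quad_form(w, cov_neg))
    return normal_cdf(mean_diff / std_diff)


# Illustrative moments (made up for this sketch, not data from the paper).
identity = [[1.0, 0.0], [0.0, 1.0]]
w_example = [1.0, 0.0]
err = expected_error(w_example, 0.5, [1.0, 0.0], identity, [-1.0, 0.0], identity)
auc = expected_auc(w_example, [1.0, 0.0], identity, [-1.0, 0.0], identity)
print(err, auc)
```

Both functions touch only the means and covariances, so once the moments have been estimated, evaluating them costs the same regardless of how many samples produced those moments. Both are also smooth in w wherever the quadratic forms are positive, which is what makes gradient-based optimization applicable.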


research
02/28/2019

Novel and Efficient Approximations for Zero-One Loss of Linear Classifiers

The predictive quality of machine learning models is typically measured ...
research
07/02/2021

Optimizing ROC Curves with a Sort-Based Surrogate Loss Function for Binary Classification and Changepoint Detection

Receiver Operating Characteristic (ROC) curves are plots of true positiv...
research
08/03/2012

On the Consistency of AUC Pairwise Optimization

AUC (area under ROC curve) is an important evaluation criterion, which h...
research
07/28/2021

Learning with Multiclass AUC: Theory and Algorithms

The Area under the ROC curve (AUC) is a well-known ranking metric for pr...
research
05/13/2016

Support Vector Algorithms for Optimizing the Partial Area Under the ROC Curve

The area under the ROC curve (AUC) is a widely used performance measure ...
research
09/23/2020

Online AUC Optimization for Sparse High-Dimensional Datasets

The Area Under the ROC Curve (AUC) is a widely used performance measure ...
research
10/12/2022

FasterRisk: Fast and Accurate Interpretable Risk Scores

Over the last century, risk scores have been the most popular form of pr...
