Analysis and Comparison of Classification Metrics

09/12/2022
by Luciana Ferrer, et al.

A number of different performance metrics are commonly used in the machine learning literature to evaluate classification systems that output categorical decisions. Some of the most common are accuracy, total error (one minus accuracy), balanced accuracy, balanced total error (one minus balanced accuracy), F-score, and the Matthews correlation coefficient (MCC). In this document, we review the definitions of these metrics and compare them with the expected cost (EC), a metric introduced in every statistical learning course but rarely used in the machine learning literature. We show that the empirical estimate of the EC is a generalized version of both the total error and the balanced total error. Further, we show its relation to the F-score and the MCC and argue that the EC is superior to them, being more general, simpler, more intuitive, and better motivated. We highlight some issues with the F-score and the MCC that make them suboptimal metrics. Although not covered in the current version of this manuscript, which focuses exclusively on metrics computed over hard decisions, the EC has the additional advantage of being a great tool for measuring the calibration of a system's scores, allowing users to make optimal decisions given a set of posteriors for each class. We leave that discussion for a future version of this manuscript.
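To make the claimed relationship concrete, the sketch below assumes the textbook definition of the empirical expected cost, EC = sum_i P_i sum_j c_ij R_ij, where P_i is the prior assigned to class i, c_ij is the cost of deciding class j when the true class is i, and R_ij is the fraction of class-i samples decided as j. The code is a minimal illustration written for this summary, not code from the paper, and the toy confusion matrix is invented for the example:

```python
import numpy as np

def expected_cost(conf, costs, priors):
    """Empirical expected cost from a KxK confusion matrix.

    conf[i, j]  : count of samples with true class i decided as class j
    costs[i, j] : cost of deciding class j when the true class is i
    priors[i]   : prior (weight) assigned to class i
    """
    # R[i, j]: fraction of class-i samples that were decided as class j
    R = conf / conf.sum(axis=1, keepdims=True)
    return float(np.sum(priors[:, None] * costs * R))

# Toy, class-imbalanced 3-class confusion matrix
# (rows: true class, columns: decision). Invented numbers.
conf = np.array([[80, 15,   5],
                 [10, 70,  20],
                 [ 2,  3, 195]])
K = conf.shape[0]

# 0-1 costs: every error costs 1, correct decisions cost 0.
costs01 = 1.0 - np.eye(K)

# With 0-1 costs and the empirical class priors,
# EC reduces to the total error (one minus accuracy).
emp_priors = conf.sum(axis=1) / conf.sum()
total_error = 1.0 - np.trace(conf) / conf.sum()
assert np.isclose(expected_cost(conf, costs01, emp_priors), total_error)

# With 0-1 costs and uniform priors, EC reduces to the balanced
# total error (the average of the per-class error rates).
uni_priors = np.full(K, 1.0 / K)
balanced_error = np.mean(1.0 - np.diag(conf) / conf.sum(axis=1))
assert np.isclose(expected_cost(conf, costs01, uni_priors), balanced_error)

print(expected_cost(conf, costs01, emp_priors), total_error)      # 0.1375
print(expected_cost(conf, costs01, uni_priors), balanced_error)   # 0.175
```

With non-uniform costs (e.g., penalizing one type of confusion more heavily than another) and arbitrary priors, the EC expresses evaluation criteria that neither the total error nor the balanced total error can, which is the sense in which it generalizes both.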


