A Tunable Loss Function for Classification

by Tyler Sypherd et al.

Recently, a parametrized class of loss functions called α-loss, α ∈ [1, ∞], has been introduced for classification. This family, which includes the log-loss and the 0-1 loss as special cases, comes with compelling properties, including an equivalent margin-based form that is classification-calibrated for all α. We introduce a generalization of this family to the entire range α ∈ (0, ∞] and establish how the parameter α enables the practitioner to choose among a host of operating conditions that are important in modern machine learning tasks. We prove that smaller α values are more conducive to faster optimization; in fact, α-loss is convex for α ≤ 1 and quasi-convex for α > 1. Moreover, we establish bounds quantifying the degradation of the local quasi-convexity of the optimization landscape as α increases, and we show that this translates directly into a computational slowdown. On the other hand, our theoretical results also suggest that larger α values lead to better generalization performance. This is a consequence of the ability of α-loss to limit the effect of less likely data as α increases from 1, thereby providing robustness to outliers and noise in the training data. We provide strong evidence supporting this assertion with several experiments on benchmark datasets that establish the efficacy of α-loss for α > 1 in providing robustness to errors in the training data. Of equal interest is the fact that, for α < 1, our experiments show that the decreased robustness seems to counteract class imbalances in the training data.
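To make the family concrete, the α-loss applied to the probability p that a model assigns to the true label takes the form (α/(α−1))·(1 − p^((α−1)/α)), recovering the log-loss as α → 1 and 1 − p, a smooth surrogate of the 0-1 loss, as α → ∞. A minimal sketch (the function name and edge-case handling are ours, not from the paper):

```python
import math

def alpha_loss(p_true, alpha):
    """alpha-loss of the probability p_true assigned to the true label.

    Recovers the log-loss in the limit alpha -> 1 and the linear loss
    1 - p_true (a surrogate of the 0-1 loss) in the limit alpha -> inf.
    """
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if math.isinf(alpha):
        return 1.0 - p_true                     # 0-1 loss surrogate
    if abs(alpha - 1.0) < 1e-8:
        return -math.log(p_true)                # log-loss limit
    return (alpha / (alpha - 1.0)) * (1.0 - p_true ** ((alpha - 1.0) / alpha))
```

For example, at p = 0.5 the loss interpolates smoothly from log 2 ≈ 0.693 at α = 1 down toward 0.5 as α grows, illustrating how large α dampens the penalty on low-probability (e.g. noisy or outlying) examples.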



