Learning with Average Top-k Loss
In this work, we introduce the average top-k (AT_k) loss as a new ensemble loss for supervised learning, defined as the average of the k largest individual losses over a training dataset. We show that the AT_k loss is a natural generalization of two widely used ensemble losses, the average loss and the maximum loss, and that it can combine their advantages and mitigate their drawbacks to better adapt to different data distributions. Furthermore, it remains a convex function of the individual losses, which can lead to convex optimization problems that can be solved effectively with conventional gradient-based methods. We provide an intuitive interpretation of the AT_k loss based on its equivalent effect on the continuous individual loss functions, suggesting that it can reduce the penalty on correctly classified data. We further give a learning theory analysis of minimum average top-k (MAT_k) learning, covering the classification calibration of the AT_k loss and the error bounds of AT_k-SVM. We demonstrate the applicability of MAT_k learning to binary classification and regression using synthetic and real datasets.
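As an illustration of the definition above, the following minimal sketch computes the AT_k loss from a vector of individual losses and shows the two special cases the abstract mentions: k = 1 recovers the maximum loss and k = n recovers the average loss. The function name average_top_k_loss and the sample loss values are assumptions made for this example; they are not taken from the paper, and the sketch covers only the loss evaluation, not the learning procedure.

```python
import numpy as np


def average_top_k_loss(individual_losses, k):
    """Average of the k largest individual losses (hypothetical helper name)."""
    losses = np.asarray(individual_losses, dtype=float)
    n = losses.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # np.partition puts the k largest values in the last k positions
    # (in no particular order), which is all the average needs.
    top_k = np.partition(losses, n - k)[n - k:]
    return float(top_k.mean())


# Made-up individual losses for five training examples.
losses = [0.1, 2.0, 0.5, 3.0, 0.0]

at_1 = average_top_k_loss(losses, 1)            # k = 1: the maximum loss
at_2 = average_top_k_loss(losses, 2)            # mean of the two largest losses
at_n = average_top_k_loss(losses, len(losses))  # k = n: the average loss
```

Intermediate values of k interpolate between these two extremes, which is the sense in which the AT_k loss generalizes both the average loss and the maximum loss.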