SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning

01/26/2023
by   Hao Chen, et al.

The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance. In this paper, we first revisit the popular pseudo-labeling methods via a unified sample weighting formulation and demonstrate the inherent quantity-quality trade-off problem of pseudo-labeling with thresholding, which may prohibit learning. To this end, we propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training, effectively exploiting the unlabeled data. We derive a truncated Gaussian function to weight samples based on their confidence, which can be viewed as a soft version of the confidence threshold. We further enhance the utilization of weakly-learned classes by proposing a uniform alignment approach. In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
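
To make the "soft version of the confidence threshold" concrete, here is a minimal sketch rather than the authors' implementation. It compares a hard confidence threshold with a truncated Gaussian weight on the maximum predicted probability. The function names, the fixed `mu` and `sigma` parameters, and the example probabilities are all assumptions for illustration; the abstract does not say how the Gaussian's parameters are chosen or estimated.

```python
import numpy as np

def hard_threshold_weight(probs, tau):
    """Hard thresholding: weight 1 if confidence >= tau, else 0."""
    conf = probs.max(axis=1)
    return (conf >= tau).astype(float)

def truncated_gaussian_weight(probs, mu, sigma):
    """Soft weighting: Gaussian in the confidence, truncated to 1 above mu.

    mu and sigma are hypothetical fixed parameters for this sketch.
    """
    conf = probs.max(axis=1)
    gauss = np.exp(-((conf - mu) ** 2) / (2.0 * sigma ** 2))
    # Truncation: samples at or above mu receive full weight.
    return np.where(conf >= mu, 1.0, gauss)

# Toy softmax outputs for three unlabeled samples (illustrative values only).
probs = np.array([
    [0.98, 0.01, 0.01],  # confident prediction
    [0.70, 0.20, 0.10],  # moderately confident
    [0.40, 0.35, 0.25],  # uncertain
])

hard = hard_threshold_weight(probs, tau=0.9)
soft = truncated_gaussian_weight(probs, mu=0.9, sigma=0.2)
print("hard:", hard)
print("soft:", soft)
```

Under the hard threshold, the two less confident samples contribute nothing to the unlabeled loss (low quantity). Under the soft weighting they still contribute, but with weights that shrink as confidence falls, which is one way to read the quantity-quality trade-off the abstract describes.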

research
03/13/2023

InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning

Recent state-of-the-art methods in imbalanced semi-supervised learning (...
research
06/13/2022

EnergyMatch: Energy-based Pseudo-Labeling for Semi-Supervised Learning

Recent state-of-the-art methods in semi-supervised learning (SSL) combin...
research
01/16/2020

Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Semi-supervised learning aims to take advantage of a large amount of unl...
research
01/15/2023

On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning

When there are unlabeled Out-Of-Distribution (OOD) data from other class...
research
06/13/2022

Confident Sinkhorn Allocation for Pseudo-Labeling

Semi-supervised learning is a critical tool in reducing machine learning...
research
11/17/2022

NorMatch: Matching Normalizing Flows with Discriminative Classifiers for Semi-Supervised Learning

Semi-Supervised Learning (SSL) aims to learn a model using a tiny labele...
research
08/13/2021

Progressive Representative Labeling for Deep Semi-Supervised Learning

Deep semi-supervised learning (SSL) has experienced significant attentio...
