Rethinking Data Augmentation: Self-Supervision and Self-Distillation

10/14/2019
by Hankook Lee, et al.

Data augmentation techniques, e.g., flipping or cropping, which systematically enlarge the training dataset by explicitly generating more training samples, are effective in improving the generalization performance of deep neural networks. In the supervised setting, a common practice is to assign the same label to all augmented samples of the same source. However, if the augmentation introduces a large distributional discrepancy among the samples (e.g., rotations), forcing label invariance across them can be too difficult to learn and often hurts performance. To tackle this challenge, we suggest a simple yet effective idea: learn the joint distribution of the original and self-supervised labels of augmented samples. The joint learning framework is easier to train and enables aggregated inference, which combines the predictions from different augmented samples to improve performance. Further, to speed up the aggregation process, we also propose a knowledge transfer technique, self-distillation, which transfers the knowledge of augmentation into the model itself. We demonstrate the effectiveness of our data augmentation framework on various fully supervised settings, including few-shot and imbalanced classification scenarios.
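To make the joint-label idea concrete, below is a minimal sketch assuming rotation augmentation (0/90/180/270 degrees) and a PyTorch-style backbone. The names (JointModel, rotate_batch, aggregated_inference), the backbone choice, and the class counts are illustrative assumptions rather than the authors' reference implementation: each image keeps its class label and additionally receives a self-supervised rotation label, and a single head predicts over the joint (class, rotation) space instead of forcing all rotated copies onto the same class logit.

```python
# Hedged sketch of joint (class, rotation) learning and aggregated inference.
# Assumptions: NCHW image tensors, 4 rotations, a feature-extracting backbone.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10   # e.g., CIFAR-10 (assumption)
NUM_ROTS = 4       # 0, 90, 180, 270 degrees

class JointModel(nn.Module):
    """Backbone plus one head over the joint label space (class, rotation)."""
    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone
        # One logit per (class, rotation) pair, indexed as class * NUM_ROTS + rot.
        self.joint_head = nn.Linear(feat_dim, NUM_CLASSES * NUM_ROTS)

    def forward(self, x):
        return self.joint_head(self.backbone(x))  # (B, C * M)

def rotate_batch(x):
    """Return the batch under all four rotations plus the rotation labels."""
    rots = [torch.rot90(x, k, dims=(2, 3)) for k in range(NUM_ROTS)]
    x_aug = torch.cat(rots, dim=0)                                   # (M*B, ...)
    rot_labels = torch.arange(NUM_ROTS).repeat_interleave(x.size(0))  # (M*B,)
    return x_aug, rot_labels.to(x.device)

def joint_loss(model, x, y):
    """Cross-entropy over the joint (class, rotation) target."""
    x_aug, rot = rotate_batch(x)
    y_rep = y.repeat(NUM_ROTS)                 # class label is shared across copies
    joint_target = y_rep * NUM_ROTS + rot      # index into the C*M logits
    return F.cross_entropy(model(x_aug), joint_target)

@torch.no_grad()
def aggregated_inference(model, x):
    """Average class scores across the rotated copies at test time."""
    x_aug, _ = rotate_batch(x)
    logits = model(x_aug).view(NUM_ROTS, x.size(0), NUM_CLASSES, NUM_ROTS)
    # For the i-th rotated copy, read the logits of rotation i, then average.
    per_rot = torch.stack([logits[i, :, :, i] for i in range(NUM_ROTS)], dim=0)
    probs = F.softmax(per_rot, dim=-1).mean(dim=0)
    return probs.argmax(dim=-1)
```

In this reading, aggregated inference requires one forward pass per rotated copy; the self-distillation step described in the abstract would additionally train a lightweight single-pass classifier to mimic the aggregated prediction, so the extra passes can be dropped at deployment. That step is omitted here and left as a sketch-level note.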

Related research

Self-paced Data Augmentation for Training Neural Networks (10/29/2020)
Data augmentation is widely used for machine learning; however, an effec...

SADT: Combining Sharpness-Aware Minimization with Self-Distillation for Improved Model Generalization (11/01/2022)
Methods for improving deep neural network training times and model gener...

Augmenting Reddit Posts to Determine Wellness Dimensions impacting Mental Health (06/06/2023)
Amid ongoing health crisis, there is a growing necessity to discern poss...

Data-Efficient Augmentation for Training Neural Networks (10/15/2022)
Data augmentation is essential to achieve state-of-the-art performance i...

Reprint: a randomized extrapolation based on principal components for data augmentation (04/26/2022)
Data scarcity and data imbalance have attracted a lot of attention in ma...

VITA: A Multi-Source Vicinal Transfer Augmentation Method for Out-of-Distribution Generalization (04/25/2022)
Invariance to diverse types of image corruption, such as noise, blurring...

Data Interpolating Prediction: Alternative Interpretation of Mixup (06/20/2019)
Data augmentation by mixing samples, such as Mixup, has widely been used...
