Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup

10/24/2022
by Muthu Chidambaram, et al.

Mixup is a data augmentation technique that relies on training using random convex combinations of data points and their labels. In recent years, Mixup has become a standard primitive used in the training of state-of-the-art image classification models due to its demonstrated benefits over empirical risk minimization with regard to generalization and robustness. In this work, we try to explain some of this success from a feature learning perspective. We focus our attention on classification problems in which each class may have multiple associated features (or views) that can be used to predict the class correctly. Our main theoretical results demonstrate that, for a non-trivial class of data distributions with two features per class, training a 2-layer convolutional network using empirical risk minimization can lead to learning only one feature for almost all classes, while training with a specific instantiation of Mixup succeeds in learning both features for every class. We also show empirically that these theoretical insights extend to the practical settings of image benchmarks modified to have additional synthetic features.
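
As a rough illustration of the augmentation described above, the following NumPy sketch forms convex combinations of a batch with a shuffled copy of itself and mixes the one-hot labels with the same weight. The function name `mixup_batch`, its parameters, and the toy data are hypothetical and not taken from the paper. The `midpoint=True` branch assumes, based on the name in the title, that the "specific instantiation" fixes the mixing weight at 1/2; the Beta(alpha, alpha) draw in the other branch is the common choice for standard Mixup and is likewise an assumption here.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=1.0, midpoint=False, rng=None):
    """Mix a batch with a shuffled copy of itself (illustrative sketch only)."""
    rng = np.random.default_rng() if rng is None else rng
    # Pair each example with a randomly chosen partner from the same batch.
    perm = rng.permutation(len(x))
    # Mixing weight: fixed at 1/2 for the assumed midpoint variant,
    # otherwise drawn once per batch from Beta(alpha, alpha).
    lam = 0.5 if midpoint else rng.beta(alpha, alpha)
    # Apply the same convex combination to inputs and to one-hot labels.
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam

# Toy batch: 4 examples with 8 features each, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
y = np.eye(3)[[0, 1, 2, 0]]
x_mid, y_mid, lam = mixup_batch(x, y, midpoint=True, rng=rng)
```

With the midpoint setting, every mixed label places weight 1/2 on each of the two paired classes (or weight 1 on a single class when both partners share a label), so each mixed example asks the model to attend to both classes at once.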

research
10/14/2021

Towards Understanding the Data Dependency of Mixup-style Training

In the Mixup training paradigm, a model is trained using convex combinat...
research
03/15/2023

The Benefits of Mixup for Feature Learning

Mixup, a simple data augmentation method that randomly mixes two data po...
research
06/10/2020

On Mixup Regularization

Mixup is a data augmentation technique that creates new examples as conv...
research
10/24/2022

Are Deep Sequence Classifiers Good at Non-Trivial Generalization?

Recent advances in deep learning models for sequence classification have...
research
06/26/2022

Multi-view Feature Augmentation with Adaptive Class Activation Mapping

We propose an end-to-end-trainable feature augmentation module built for...
research
01/30/2015

Multi-task Image Classification via Collaborative, Hierarchical Spike-and-Slab Priors

Promising results have been achieved in image classification problems by...
research
02/17/2022

Universality of empirical risk minimization

Consider supervised learning from i.i.d. samples { x_i,y_i}_i≤ n where x...
