Making EfficientNet More Efficient: Exploring Batch-Independent Normalization, Group Convolutions and Reduced Resolution Training

06/07/2021
by Dominic Masters et al.

Much recent research has been dedicated to improving the efficiency of training and inference for image classification. This effort has commonly focused on explicitly improving theoretical efficiency, often measured as ImageNet validation accuracy per FLOP. These theoretical savings have, however, proven challenging to achieve in practice, particularly on high-performance training accelerators. In this work, we focus on improving the practical efficiency of the state-of-the-art EfficientNet models on a new class of accelerator, the Graphcore IPU. We do this by extending this family of models in the following ways: (i) generalising depthwise convolutions to group convolutions; (ii) adding proxy-normalized activations to match batch normalization performance with batch-independent statistics; (iii) reducing compute by lowering the training resolution and inexpensively fine-tuning at higher resolution. We find that these three methods improve the practical efficiency for both training and inference. Our code will be made available online.
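To make point (i) concrete, here is a minimal PyTorch sketch (our own illustration; the helper name `grouped_conv` and the example group sizes are ours, not from the paper) of generalising a depthwise convolution to a group convolution: group size 1 recovers the depthwise case, while larger groups raise arithmetic intensity, which tends to map better onto accelerators.

```python
import torch
import torch.nn as nn

def grouped_conv(channels: int, kernel_size: int, group_size: int) -> nn.Conv2d:
    """Spatial convolution with `group_size` channels per group.

    group_size == 1 recovers a depthwise convolution (groups == channels);
    group_size == channels recovers a dense convolution (groups == 1).
    """
    assert channels % group_size == 0, "channels must be divisible by group size"
    return nn.Conv2d(
        channels, channels, kernel_size,
        padding=kernel_size // 2,
        groups=channels // group_size,
        bias=False,
    )

# Depthwise (the EfficientNet default) vs. group size 16.
x = torch.randn(8, 96, 32, 32)
depthwise = grouped_conv(96, 3, group_size=1)
grouped = grouped_conv(96, 3, group_size=16)
print(depthwise(x).shape, grouped(x).shape)  # both: torch.Size([8, 96, 32, 32])
```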
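Point (ii) pairs a batch-independent normalization such as GroupNorm with proxy-normalized activations, following the companion paper "Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence". The sketch below is our own sampling-based approximation of that idea, not the authors' implementation: the post-activation output is renormalized by the statistics that the same affine transform and nonlinearity induce on a unit-Gaussian proxy.

```python
import torch
import torch.nn as nn

class ProxyNormActivation(nn.Module):
    """Sketch of a proxy-normalized activation (sampling-based approximation).

    Applies affine + nonlinearity to the already-normalized input, then divides
    out the per-channel mean/std that the same affine + nonlinearity induce on
    a unit-Gaussian "proxy" variable, keeping activations batch-independent.
    """

    def __init__(self, channels: int, n_samples: int = 1024, eps: float = 1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(channels))
        self.beta = nn.Parameter(torch.zeros(channels))
        self.act = nn.SiLU()  # EfficientNet's nonlinearity
        self.eps = eps
        # Fixed Gaussian samples standing in for the proxy distribution.
        self.register_buffer("proxy", torch.randn(n_samples))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W), output of a batch-independent norm such as GroupNorm.
        g = self.gamma.view(1, -1, 1, 1)
        b = self.beta.view(1, -1, 1, 1)
        y = self.act(g * x + b)
        # Per-channel statistics of the activation under the Gaussian proxy.
        z = self.act(self.gamma[:, None] * self.proxy[None, :] + self.beta[:, None])
        mean = z.mean(dim=1).view(1, -1, 1, 1)
        std = z.std(dim=1).view(1, -1, 1, 1)
        return (y - mean) / (std + self.eps)

# Usage: GroupNorm without affine parameters, followed by the proxy-normalized
# activation that carries the affine transform itself.
block = nn.Sequential(
    nn.GroupNorm(num_groups=8, num_channels=96, affine=False),
    ProxyNormActivation(96),
)
print(block(torch.randn(4, 96, 16, 16)).shape)  # torch.Size([4, 96, 16, 16])
```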
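Point (iii) amounts to a two-phase schedule: do the bulk of training at reduced resolution, then fine-tune briefly at the higher inference resolution. A toy sketch of that schedule follows; the resolutions, step counts, learning rates, and the stand-in model and data are illustrative, not the paper's settings.

```python
import torch
import torch.nn as nn

def run_phase(model: nn.Module, res: int, steps: int, lr: float) -> None:
    """One training phase at a given input resolution (random data stand-in)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        x = torch.randn(8, 3, res, res)       # stand-in for resized image batches
        y = torch.randint(0, 1000, (8,))      # stand-in for ImageNet labels
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Toy resolution-agnostic model; a real run would use an EfficientNet variant.
model = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(3, 1000))
run_phase(model, res=160, steps=100, lr=0.1)   # phase 1: cheap low-resolution training
run_phase(model, res=224, steps=10, lr=0.01)   # phase 2: brief high-resolution fine-tune
```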
