Re-parameterizing Your Optimizers rather than Architectures

05/30/2022
by Xiaohan Ding, et al.

The well-designed structures in neural networks reflect the prior knowledge incorporated into the models. However, though different models have various priors, we are used to training them with model-agnostic optimizers (e.g., SGD). In this paper, we propose a novel paradigm of incorporating model-specific prior knowledge into optimizers and using them to train generic (simple) models. As an implementation, we propose a novel methodology to add prior knowledge by modifying the gradients according to a set of model-specific hyper-parameters, which is referred to as Gradient Re-parameterization, and the optimizers are named RepOptimizers. For the extreme simplicity of model structure, we focus on a VGG-style plain model and showcase that such a simple model trained with a RepOptimizer, which is referred to as RepOpt-VGG, performs on par with the recent well-designed models. From a practical perspective, RepOpt-VGG is a favorable base model because of its simple structure, high inference speed and training efficiency. Compared to Structural Re-parameterization, which adds priors into models via constructing extra training-time structures, RepOptimizers require no extra forward/backward computations and solve the problem of quantization. The code and models are publicly available at https://github.com/DingXiaoH/RepOptimizers.
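The idea of modifying gradients according to model-specific hyper-parameters can be illustrated with a minimal sketch. The abstract does not say how the hyper-parameters enter the update, so the code below assumes the simplest reading: each gradient entry is scaled by a fixed, per-parameter multiplier before a plain SGD step. The function name `sgd_step_with_grad_mult` and the multiplier values are hypothetical and are not taken from the authors' released code.

```python
def sgd_step_with_grad_mult(weights, grads, multipliers, lr=0.1):
    """One plain SGD step in which each gradient is first scaled elementwise.

    `multipliers` stands in for the model-specific hyper-parameters
    (an assumption for illustration; the abstract gives no formula).
    """
    if not (len(weights) == len(grads) == len(multipliers)):
        raise ValueError("weights, grads and multipliers must have equal length")
    # w <- w - lr * (m * g): the prior lives in the optimizer, not the model.
    return [w - lr * m * g for w, g, m in zip(weights, grads, multipliers)]


# Toy example with three parameters that share the same raw gradient.
weights = [1.0, 1.0, 1.0]
grads = [0.5, 0.5, 0.5]
multipliers = [1.0, 2.0, 0.0]  # hypothetical values: normal, amplified, frozen

updated = sgd_step_with_grad_mult(weights, grads, multipliers, lr=0.1)
print(updated)
```

With identical raw gradients, the three parameters move by different amounts purely because of the multipliers, and the model itself gains no extra layers or forward/backward cost. That is the property the abstract contrasts with Structural Re-parameterization.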

research
01/11/2021

RepVGG: Making VGG-style ConvNets Great Again

We present a simple but powerful architecture of convolutional neural ne...
research
03/13/2020

Dynamic transformation of prior knowledge into Bayesian models for data streams

We consider how to effectively use prior knowledge when learning a Bayes...
research
03/23/2023

NOPE: Novel Object Pose Estimation from a Single Image

The practicality of 3D object pose estimation remains limited for many a...
research
04/02/2022

Online Convolutional Re-parameterization

Structural re-parameterization has drawn increasing attention in various...
research
06/01/2021

Prior-Enhanced Few-Shot Segmentation with Meta-Prototypes

Few-shot segmentation (FSS) performance has been extensively promoted by...
research
09/05/2023

SAM-Deblur: Let Segment Anything Boost Image Deblurring

Image deblurring is a critical task in the field of image restoration, a...
research
06/11/2021

A Novel Approach to Lifelong Learning: The Plastic Support Structure

We propose a novel approach to lifelong learning, introducing a compact ...