AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis

09/05/2023
by Lei Guan, et al.

This paper proposes an efficient optimizer called AdaPlus, which integrates Nesterov momentum and precise stepsize adjustment on the basis of AdamW. AdaPlus combines the advantages of AdamW, Nadam, and AdaBelief and, notably, introduces no extra hyper-parameters. We perform extensive experimental evaluations on three machine learning tasks to validate the effectiveness of AdaPlus. The results show that AdaPlus (i) is the adaptive method that performs most comparably to (and even slightly better than) SGD with momentum on image classification tasks, and (ii) outperforms other state-of-the-art optimizers on language modeling tasks and exhibits the highest stability when training GANs. The experiment code of AdaPlus is available at: https://github.com/guanleics/AdaPlus.
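
The abstract names the three ingredients (AdamW's decoupled weight decay, Nadam's Nesterov momentum, and AdaBelief's belief-based stepsize adjustment) but does not spell out the update rule. Below is a minimal single-step sketch assembled from the standard formulas of those three optimizers; the function name `adaplus_step`, the hyper-parameter defaults, and the exact way the pieces are combined are assumptions for illustration, not taken from the paper or the linked repository.

```python
import numpy as np

def adaplus_step(theta, grad, m, s, t,
                 lr=1e-3, beta1=0.9, beta2=0.999,
                 eps=1e-8, weight_decay=1e-2):
    """One hypothetical AdaPlus-style update (t is the 1-based step count).

    Assembled from the standard Nadam, AdaBelief, and AdamW formulas;
    the actual AdaPlus combination may differ.
    """
    # First-moment EMA of the gradient (Adam/Nadam momentum).
    m = beta1 * m + (1 - beta1) * grad
    # "Belief" second moment: EMA of the squared deviation of the gradient
    # from its EMA, as in AdaBelief (small step when the gradient matches
    # the belief, large step when it deviates).
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps
    # Standard bias corrections.
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Nesterov look-ahead: blend the momentum estimate with the current
    # gradient, as in (simplified) Nadam.
    nesterov = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1 ** t)
    # Decoupled weight decay applied directly to the weights (AdamW).
    theta = theta - lr * weight_decay * theta
    # Belief-scaled step.
    theta = theta - lr * nesterov / (np.sqrt(s_hat) + eps)
    return theta, m, s

# Toy usage: minimize f(theta) = 0.5 * ||theta||^2, whose gradient is theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
s = np.zeros_like(theta)
for t in range(1, 201):
    theta, m, s = adaplus_step(theta, theta, m, s, t, lr=0.1)
print(theta)  # approaches the origin
```

Note that the sketch uses only AdamW's existing hyper-parameters, consistent with the abstract's claim that AdaPlus introduces no extra ones.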
