On Higher-order Moments in Adam

10/15/2019
by Zhanhong Jiang, et al.

In this paper, we investigate the popular deep learning optimization routine Adam from the perspective of statistical moments. While Adam is an adaptive method based on lower-order moments of the stochastic gradient, we propose an extension, named HAdam, that uses higher-order moments of the stochastic gradient. Our analysis and experiments reveal that certain higher-order moments of the stochastic gradient can achieve better performance than the vanilla Adam algorithm. We also provide an analysis of HAdam with respect to odd and even moments to explain some intriguing and seemingly non-intuitive empirical results.
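The abstract does not spell out the update rule, but the core idea of replacing Adam's second-moment estimate with a p-th moment estimate can be sketched as follows. The function name hadam_step, the use of |g|^p with a p-th-root denominator, and the default hyperparameters are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def hadam_step(theta, grad, m, v, t, p=4, lr=1e-3,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One HAdam-style update (sketch). Setting p = 2 recovers Adam."""
    # First-moment estimate of the gradient (same as Adam).
    m = beta1 * m + (1 - beta1) * grad
    # p-th moment estimate of the gradient (Adam uses p = 2).
    v = beta2 * v + (1 - beta2) * np.abs(grad) ** p
    # Bias correction for the exponential moving averages.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update, normalized by the p-th root of the p-th moment.
    theta = theta - lr * m_hat / (v_hat ** (1.0 / p) + eps)
    return theta, m, v
```

For even p, raw powers of the gradient are already non-negative, so the absolute value is redundant; for odd p it keeps the moment estimate non-negative. This parity difference is one plausible reading of why the abstract separately analyzes odd and even moments.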


