Optimistic Adaptive Acceleration for Optimization

March 4, 2019
by Jun-Kun Wang, et al.

We consider a new variant of AMSGrad. AMSGrad [RKK18] is a popular adaptive gradient-based optimization algorithm widely used for training deep neural networks. Our variant assumes that the mini-batch gradients in consecutive iterations have some underlying structure that makes them sequentially predictable. By exploiting this predictability together with ideas from the field of optimistic online learning, the new algorithm accelerates convergence and enjoys a tighter regret bound. We conduct experiments training various neural networks on several datasets and show that the proposed method speeds up convergence in practice.
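The abstract does not spell out the update rule, so the following is only a minimal sketch of what an "optimistic" adaptive update of this flavor can look like: a standard AMSGrad step followed by an extra step along a *predicted* next gradient. The function names, the state layout, and the choice of recycling the previous gradient as the prediction are all illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def init_state(w):
    """Optimizer state: first/second moments, AMSGrad running max, and the intermediate iterate."""
    return {"m": np.zeros_like(w), "v": np.zeros_like(w),
            "v_hat": np.zeros_like(w), "w_half": w.copy()}

def optimistic_amsgrad_step(grad, grad_guess, state,
                            lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One optimistic AMSGrad-style step (illustrative sketch, not the paper's exact rule).

    grad       -- mini-batch gradient evaluated at the current iterate
    grad_guess -- a prediction of the next gradient (supplied by the caller)
    """
    m, v = state["m"], state["v"]

    # Standard AMSGrad moment updates.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    v_hat = np.maximum(state["v_hat"], v)      # non-decreasing second moment (AMSGrad)
    denom = np.sqrt(v_hat) + eps

    # Usual adaptive step on the intermediate iterate ...
    w_half = state["w_half"] - lr * m / denom
    # ... plus an optimistic step along the predicted next gradient.
    w_next = w_half - lr * grad_guess / denom

    state.update(m=m, v=v, v_hat=v_hat, w_half=w_half)
    return w_next, state

# Usage sketch on a toy quadratic 0.5 * ||w||^2, whose gradient is simply w.
# The naive predictor reuses the previous gradient as the guess for the next one.
w = np.random.randn(10)
state = init_state(w)
prev_grad = np.zeros_like(w)
for _ in range(100):
    grad = w                                   # gradient of the toy objective at w
    w, state = optimistic_amsgrad_step(grad, grad_guess=prev_grad, state=state)
    prev_grad = grad
```

If the prediction is accurate (i.e., consecutive mini-batch gradients are indeed close), the optimistic step moves the iterate toward where the next update would have taken it anyway, which is the intuition behind the claimed acceleration; if the prediction is poor, the method degrades gracefully toward plain AMSGrad behavior.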
