An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks

In this work, we present a hardware-compatible neural network training algorithm based on the alternating direction method of multipliers (ADMM) and iterative least-squares methods. The motivation behind this approach is to obtain a training method that is scalable and parallelisable, properties that make the algorithm well suited to hardware implementation. On the HIGGS dataset, a four-layer neural network with a hidden size of 28 achieves 6.9% and 6.8% better accuracy than SGD and Adam, respectively. Likewise, on the IRIS dataset, a three-layer neural network with a hidden size of 8 yields accuracy improvements of 21.0% and 2.2% over SGD and Adam, respectively. At the same time, the method avoids matrix inversion, which is challenging to implement in hardware. We assess the impact of avoiding matrix inversion on ADMM accuracy and observe that matrix inversion can be safely replaced with iterative least-squares methods while maintaining the desired performance. Moreover, the computational complexity of the implemented method is polynomial in the dimensions of the input dataset and the hidden size of the network.
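The abstract does not spell out the paper's exact ADMM splitting, but the key computational claim, replacing the matrix inversion in each least-squares subproblem with an iterative solver, can be illustrated concretely. Below is a minimal Python sketch, assuming each layer's weight update takes the common ADMM form argmin_W ||W A - Z||_F^2, where A holds the previous layer's activations and Z the ADMM auxiliary targets; the solver runs conjugate gradient on the normal equations (CGNR), which needs only matrix-vector products and never forms an explicit inverse. The function names are illustrative, not from the paper.

```python
# Minimal sketch: an ADMM-style least-squares weight update solved with
# conjugate gradient on the normal equations (CGNR) instead of an
# explicit matrix inverse. Names are illustrative, not the paper's.
import numpy as np

def cgnr(A, b, num_iters=50, tol=1e-10):
    """Solve min_x ||A x - b||_2 via CG on A^T A x = A^T b, using only
    matrix-vector products (no inverse, no factorisation)."""
    x = np.zeros(A.shape[1])
    r = b.astype(float).copy()   # data-space residual b - A x (x = 0)
    s = A.T @ r                  # residual of the normal equations
    p = s.copy()
    gamma = s @ s
    for _ in range(num_iters):
        q = A @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        if np.sqrt(gamma_new) < tol:
            break
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

def admm_weight_update(A, Z, **kw):
    """Hypothetical layer update: argmin_W ||W A - Z||_F^2, solved
    row-by-row as min_w ||A^T w - z||_2 for each output unit."""
    return np.stack([cgnr(A.T, Z[i], **kw) for i in range(Z.shape[0])], axis=0)

# Illustrative usage with random data (hidden size 28, as in the abstract):
rng = np.random.default_rng(0)
A = rng.standard_normal((28, 512))   # previous-layer activations (features x samples)
Z = rng.standard_normal((4, 512))    # ADMM auxiliary targets for this layer
W = admm_weight_update(A, Z)
print(np.linalg.norm(W @ A - Z))     # least-squares residual of the fit
```

Because CGNR touches A only through the products A p and A^T r, the update never forms or inverts A A^T, which is exactly the property the abstract highlights as hardware-friendly.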


Related research

- Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers (12/31/2021)
- A Convergent ADMM Framework for Efficient Neural Network Training (12/22/2021)
- An inner-loop free solution to inverse problems using deep neural networks (09/06/2017)
- Using the Projected Belief Network at High Dimensions (04/25/2022)
- Learning A Deep ℓ_∞ Encoder for Hashing (04/06/2016)
- Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding (09/12/2023)
