On the Computational Efficiency of Training Neural Networks

10/05/2014
by Roi Livni, et al.

It is well known that neural networks are computationally hard to train. On the other hand, in practice, modern-day neural networks are trained efficiently using SGD and a variety of tricks, including different activation functions (e.g., ReLU), over-specification (i.e., training networks that are larger than needed), and regularization. In this paper we revisit the computational complexity of training neural networks from a modern perspective. We provide both positive and negative results, some of which yield new provably efficient and practical algorithms for training certain types of neural networks.
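The practical recipe the abstract refers to is easy to make concrete. Below is a minimal NumPy sketch, not taken from the paper, of that recipe: plain SGD on an over-specified one-hidden-layer ReLU network with L2 (weight-decay) regularization. The toy teacher function, network width, learning rate, and weight-decay coefficient are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's algorithm): SGD on an
# over-specified one-hidden-layer ReLU network with L2 regularization.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: the target needs only a single ReLU unit to represent,
# but we train a much wider ("over-specified") student network.
n, d, hidden = 512, 10, 256
X = rng.standard_normal((n, d))
y = np.maximum(X @ rng.standard_normal(d), 0.0)  # single-ReLU teacher

# Over-specified student: d -> hidden -> 1 with ReLU activation.
W1 = rng.standard_normal((d, hidden)) / np.sqrt(d)
W2 = rng.standard_normal((hidden, 1)) / np.sqrt(hidden)

lr, weight_decay, batch, epochs = 0.05, 1e-4, 64, 200  # assumed hyperparameters

for epoch in range(epochs):
    perm = rng.permutation(n)
    for start in range(0, n, batch):
        idx = perm[start:start + batch]
        xb, yb = X[idx], y[idx, None]

        # Forward pass.
        h = np.maximum(xb @ W1, 0.0)          # ReLU activations
        err = h @ W2 - yb                     # squared-loss residual

        # Backward pass: gradients of 0.5 * mean squared error + L2 penalty.
        grad_W2 = h.T @ err / len(idx) + weight_decay * W2
        grad_h = err @ W2.T
        grad_h[h <= 0.0] = 0.0                # ReLU gate
        grad_W1 = xb.T @ grad_h / len(idx) + weight_decay * W1

        # Plain SGD step.
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2

final_loss = 0.5 * np.mean((np.maximum(X @ W1, 0.0) @ W2 - y[:, None]) ** 2)
print(f"final training loss: {final_loss:.4f}")
```

Here the student network has far more hidden units than the single ReLU needed to fit the target, which is precisely the kind of over-specification, combined with SGD and regularization, that the paper studies from a computational-complexity viewpoint.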

