The Loss Surfaces of Neural Networks with General Activation Functions

04/08/2020
by Nicholas P. Baskerville, et al.

We present results extending the foundational work of Choromanska et al. (2015) on the complexity of the loss surfaces of multi-layer neural networks. We remove the strict reliance on specifically ReLU activation functions and obtain broadly the same results for general activation functions. This is achieved with piecewise linear approximations to general activation functions, Kac-Rice calculations akin to those of Auffinger, Ben Arous and Černý (2013), and asymptotic analysis made possible by supersymmetric methods. Our results strengthen the case for the conclusions of Choromanska et al. (2015), and our calculations contain various novel details required to deal with certain perturbations of the classical spin-glass calculations.
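The two technical ingredients named in the abstract can be briefly illustrated. In one standard form (as in Auffinger, Ben Arous and Černý), the Kac-Rice formula expresses the expected number of critical points of a sufficiently regular random field $f$ on a set $B$ as

$$\mathbb{E}\,\mathrm{Crt}_f(B) = \int_B \mathbb{E}\Big[\big|\det \nabla^2 f(x)\big| \,\Big|\, \nabla f(x) = 0\Big]\, p_{\nabla f(x)}(0)\, dx,$$

where $p_{\nabla f(x)}$ denotes the density of the gradient of $f$ at $x$.

As for the piecewise linear approximation step, the following is a minimal sketch, not the paper's construction: the function name `piecewise_linear_approx`, the evenly spaced breakpoints, and the choice of `tanh` are assumptions made for this example.

```python
import numpy as np

def piecewise_linear_approx(f, x_min=-5.0, x_max=5.0, num_pieces=16):
    """Piecewise linear interpolant of f on [x_min, x_max].

    Interpolates f at num_pieces + 1 evenly spaced breakpoints; outside
    the interval, np.interp clamps to the endpoint values (constant
    extension), which suffices for this illustration.
    """
    breakpoints = np.linspace(x_min, x_max, num_pieces + 1)
    values = f(breakpoints)
    return lambda x: np.interp(x, breakpoints, values)

# Example: approximate tanh and measure the worst-case error on a grid.
approx = piecewise_linear_approx(np.tanh)
grid = np.linspace(-5.0, 5.0, 1001)
max_err = np.max(np.abs(approx(grid) - np.tanh(grid)))
print(f"max |tanh - approx| on [-5, 5]: {max_err:.4f}")
```

Refining the breakpoints drives the sup-norm error to zero for any continuous activation on a compact interval, which is what makes a reduction from general activations to piecewise linear ones plausible.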
