Fast Evaluation and Approximation of the Gauss-Newton Hessian Matrix for the Multilayer Perceptron

10/27/2019
by Chao Chen, et al.

We introduce a fast algorithm for entry-wise evaluation of the Gauss-Newton Hessian (GNH) matrix for the multilayer perceptron. The algorithm has a precomputation step and a sampling step. While it generally requires O(Nn) work to compute an entry (and the entire column) of the GNH matrix for a neural network with N parameters and n data points, our fast sampling algorithm reduces the cost to O(n + d/ϵ^2) work, where d is the output dimension of the network and ϵ is a prescribed accuracy (independent of N). One application of our algorithm is constructing the hierarchical-matrix (ℋ-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. While it generally requires O(N^2) memory to store the GNH matrix and O(N^3) work to factorize it, the ℋ-matrix approximation requires only O(N r_o) memory and O(N r_o^2) work to be factorized, where r_o ≪ N is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the ℋ-matrix approximation on classification and autoencoder neural networks.
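
To make the quantities in the abstract concrete, below is a minimal JAX sketch of the GNH definition H = (1/n) Σ_x J(x)ᵀ Q J(x) and of the naive O(Nn) entry-wise evaluation that the paper's sampling algorithm accelerates. The MLP architecture, layer sizes, and the squared loss (so Q = I) are illustrative assumptions; this is not the paper's precomputation/sampling algorithm.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Assumed toy MLP: widths and the squared loss are illustrative choices,
# not taken from the paper.
sizes = [4, 8, 8, 3]                          # input dim 4, output dim d = 3
key = jax.random.PRNGKey(0)
params = [(jax.random.normal(key, (m, k)) * 0.1, jnp.zeros(m))
          for k, m in zip(sizes[:-1], sizes[1:])]
flat_params, unravel = ravel_pytree(params)   # N = flat_params.size parameters

def mlp(p, x):
    """Forward pass of the MLP given a flat parameter vector p."""
    layers = unravel(p)
    for W, b in layers[:-1]:
        x = jnp.tanh(W @ x + b)
    W, b = layers[-1]
    return W @ x + b                          # d-dimensional output

def gnh_entry(p, X, i, j):
    """Naive O(Nn) evaluation of GNH entry H[i, j], where
    H = (1/n) * sum_x J(x)^T Q J(x), J(x) is the (d, N) Jacobian of the
    network output w.r.t. the parameters, and Q = I for a squared loss."""
    def contrib(x):
        J = jax.jacobian(mlp)(p, x)           # (d, N) Jacobian at one point
        return J[:, i] @ J[:, j]
    return jnp.mean(jax.vmap(contrib)(X))     # average over all n data points

X = jax.random.normal(key, (32, sizes[0]))    # n = 32 data points
print(gnh_entry(flat_params, X, 0, 1))
```

Presumably the paper's fast algorithm replaces this exact mean over all n data points with a precomputation step plus a sampled estimate, which is where the O(n + d/ϵ^2) bound enters; the loop above is only meant to make the O(Nn) baseline cost concrete.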
