Passed & Spurious: analysing descent algorithms and local minima in spiked matrix-tensor model

by Stefano Sarao Mannelli, et al.

In this work we analyse quantitatively the interplay between the loss landscape and the performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model. We study a loss function that is the negative log-likelihood of the model. We analyse the number of local minima at a fixed distance from the signal/spike with the Kac-Rice formula, and locate the trivialization of the landscape at large signal-to-noise ratios. We evaluate in closed form the performance of a gradient flow algorithm using integro-differential PDEs as developed in the physics of disordered systems for the Langevin dynamics. We analyse the performance of an approximate message passing algorithm estimating the maximum likelihood configuration via its state evolution. We conclude by comparing the above results: while we observe a drastic slowdown of the gradient flow dynamics even in the region where the landscape is trivial, both analysed algorithms are shown to perform well even in part of the parameter region where spurious local minima are present.
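To make the setting concrete, here is a minimal numerical sketch of the kind of problem the abstract describes: a matrix and an order-3 tensor, each carrying the same spike plus Gaussian noise, a squared-error loss (the negative log-likelihood up to constants for Gaussian noise), and a discretized gradient flow constrained to the sphere. The normalizations, the choice of tensor order 3, the function names, and the parameters `delta2`, `delta3`, `lr`, and `steps` are all assumptions made for illustration, not the paper's definitions, and this finite-size simulation is not the closed-form analysis the abstract refers to.

```python
import numpy as np


def make_instance(n, delta2, delta3, rng):
    """Draw a hypothetical spiked matrix-tensor instance (tensor order 3).

    Scalings are illustrative assumptions, not taken from the paper.
    """
    x_star = rng.standard_normal(n)
    x_star *= np.sqrt(n) / np.linalg.norm(x_star)  # spike on the sphere |x|^2 = n
    g = rng.standard_normal((n, n))
    noise2 = (g + g.T) / np.sqrt(2.0)  # symmetric Gaussian noise
    Y = np.outer(x_star, x_star) / np.sqrt(n) + np.sqrt(delta2) * noise2
    spike3 = np.einsum("i,j,k->ijk", x_star, x_star, x_star) / n
    T = spike3 + np.sqrt(delta3) * rng.standard_normal((n, n, n))
    return x_star, Y, T


def loss(x, Y, T, delta2, delta3):
    """Squared-error loss over the matrix and tensor observations."""
    n = x.size
    r2 = Y - np.outer(x, x) / np.sqrt(n)
    r3 = T - np.einsum("i,j,k->ijk", x, x, x) / n
    return np.sum(r2**2) / (4 * delta2) + np.sum(r3**2) / (6 * delta3)


def grad(x, Y, T, delta2, delta3):
    """Exact gradient of `loss` with respect to x."""
    n = x.size
    r2 = Y - np.outer(x, x) / np.sqrt(n)
    r3 = T - np.einsum("i,j,k->ijk", x, x, x) / n
    g2 = -(r2 @ x + r2.T @ x) / (2 * delta2 * np.sqrt(n))
    # Contract the residual tensor with x on every pair of indices.
    c = (np.einsum("kjl,j,l->k", r3, x, x)
         + np.einsum("jkl,j,l->k", r3, x, x)
         + np.einsum("jlk,j,l->k", r3, x, x))
    g3 = -c / (3 * delta3 * n)
    return g2 + g3


def descend_on_sphere(x0, Y, T, delta2, delta3, lr=0.01, steps=200):
    """Discretized gradient flow: tangent-space step, then renormalize."""
    n = x0.size
    x = x0 * np.sqrt(n) / np.linalg.norm(x0)
    for _ in range(steps):
        g = grad(x, Y, T, delta2, delta3)
        g = g - (g @ x) / n * x  # remove the radial component
        x = x - lr * g
        x *= np.sqrt(n) / np.linalg.norm(x)  # project back onto the sphere
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_star, Y, T = make_instance(20, 0.5, 1.0, rng)
    x_hat = descend_on_sphere(rng.standard_normal(20), Y, T, 0.5, 1.0)
    print("overlap with spike:", abs(x_hat @ x_star) / 20)
```

The overlap printed at the end, the normalized inner product between the estimate and the spike, is one natural way to track how well a descent run recovers the signal; its value depends on the noise levels and the random initialization.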




Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

Gradient-based algorithms are effective for many machine learning tasks,...

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

Gradient-descent-based algorithms and their stochastic versions have wid...

Algorithmic thresholds for tensor PCA

We study the algorithmic thresholds for principal component analysis of ...

Structure and Gradient Dynamics Near Global Minima of Two-layer Neural Networks

Under mild assumptions, we investigate the structure of loss landscape o...

Saving Gradient and Negative Curvature Computations: Finding Local Minima More Efficiently

We propose a family of nonconvex optimization algorithms that are able t...

The layer-wise L1 Loss Landscape of Neural Nets is more complex around local minima

For fixed training data and network parameters in the other layers the L...

Attractor Metadynamics in Adapting Neural Networks

Slow adaption processes, like synaptic and intrinsic plasticity, abound ...
