Noether: The More Things Change, the More Stay the Same

by   Grzegorz Głuch, et al.

Symmetries have proven to be important ingredients in the analysis of neural networks. So far their use has mostly been implicit or seemingly coincidental. We undertake a systematic study of the role that symmetry plays. In particular, we clarify how symmetry interacts with the learning algorithm. The key ingredient in our study is played by Noether's celebrated theorem which, informally speaking, states that symmetry leads to conserved quantities (e.g., conservation of energy or conservation of momentum). In the realm of neural networks under gradient descent, model symmetries imply restrictions on the gradient path. E.g., we show that symmetry of activation functions leads to boundedness of weight matrices, for the specific case of linear activations it leads to balance equations of consecutive layers, data augmentation leads to gradient paths that have "momentum"-type restrictions, and time symmetry leads to a version of the Neural Tangent Kernel. Symmetry alone does not specify the optimization path, but the more symmetries are contained in the model the more restrictions are imposed on the path. Since symmetry also implies over-parametrization, this in effect implies that some part of this over-parametrization is cancelled out by the existence of the conserved quantities. Symmetry can therefore be thought of as one further important tool in understanding the performance of neural networks under gradient descent.


page 1

page 2

page 3

page 4


A Geometric Approach of Gradient Descent Algorithms in Neural Networks

In this article we present a geometric framework to analyze convergence ...

Discrete Lagrangian Neural Networks with Automatic Symmetry Discovery

By one of the most fundamental principles in physics, a dynamical system...

Symmetry-invariant optimization in deep networks

Recent works have highlighted scale invariance or symmetry that is prese...

Improving Simulations with Symmetry Control Neural Networks

The dynamics of physical systems is often constrained to lower dimension...

Spurious Local Minima of Shallow ReLU Networks Conform with the Symmetry of the Target Model

We consider the optimization problem associated with fitting two-layer R...

Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics

Predicting the dynamics of neural network parameters during training is ...

Relaxing Equivariance Constraints with Non-stationary Continuous Filters

Equivariances provide useful inductive biases in neural network modeling...

Please sign up or login with your details

Forgot password? Click here to reset