Training Fully Connected Neural Networks is ∃ℝ-Complete

by Daniel Bertschinger et al.

We consider the algorithmic problem of finding the optimal weights and biases for a two-layer fully connected neural network to fit a given set of data points, a problem known as empirical risk minimization in the machine learning community. We show that the problem is ∃ℝ-complete. This complexity class can be defined as the set of algorithmic problems that are polynomial-time equivalent to finding real roots of a polynomial with integer coefficients. Our results hold even if all of the following restrictions are added simultaneously:

∙ There are exactly two output neurons.
∙ There are exactly two input neurons.
∙ The data has only 13 different labels.
∙ The number of hidden neurons is a constant fraction of the number of data points.
∙ The target training error is zero.
∙ The ReLU activation function is used.

This shows that even very simple networks are difficult to train. The result offers an explanation (though far from a complete understanding) of why, in practice, only gradient descent is widely successful in training neural networks. We generalize a recent result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021]. Our result falls into a recent line of research showing that a series of central algorithmic problems from widely different areas of computer science and mathematics are ∃ℝ-complete: this includes the art gallery problem [JACM/STOC 2018], geometric packing [FOCS 2020], covering polygons with convex polygons [FOCS 2021], and continuous constraint satisfaction problems [FOCS 2021].
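To make the problem concrete, the following is a minimal sketch (not taken from the paper) of the quantities involved: a two-layer fully connected ReLU network with two inputs and two outputs, and the training error that empirical risk minimization asks to drive to the target threshold. The specific data points and weights below are illustrative assumptions; the hard part, deciding whether *any* real weights achieve error at most the threshold, is exactly the ∃ℝ-complete question, while the code only evaluates one candidate assignment.

```python
import numpy as np

def two_layer_relu(x, W1, b1, W2, b2):
    """Forward pass of a two-layer fully connected network:
    a ReLU hidden layer followed by a linear output layer."""
    h = np.maximum(0.0, W1 @ x + b1)  # hidden activations (ReLU)
    return W2 @ h + b2                # two output neurons (linear)

def training_error(data, W1, b1, W2, b2):
    """Total squared error over the data set -- the quantity that
    empirical risk minimization tries to push to the threshold (here 0)."""
    return sum(np.sum((two_layer_relu(x, W1, b1, W2, b2) - y) ** 2)
               for x, y in data)

# Illustrative instance: two data points with 2-dimensional inputs/labels.
data = [(np.array([0.0, 0.0]), np.array([0.0, 0.0])),
        (np.array([1.0, 1.0]), np.array([2.0, 2.0]))]

# One candidate weight assignment (a single hidden neuron).
W1 = np.array([[1.0, 1.0]]); b1 = np.array([0.0])
W2 = np.array([[1.0], [1.0]]); b2 = np.array([0.0])

print(training_error(data, W1, b1, W2, b2))  # 0.0 for this candidate
```

Verifying a given weight assignment is easy, as above; the hardness result says that deciding existence of such an assignment is as hard as the existential theory of the reals, so optimal weights may even require irrational values that gradient descent can only approximate.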

