A closer look at the approximation capabilities of neural networks

by Kai Fong Ernest Chong, et al.

The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions σ, then a standard feedforward neural network with one hidden layer can approximate any continuous multivariate function f to any given approximation threshold ε if and only if σ is non-polynomial. In this paper, we give a direct algebraic proof of the theorem. Furthermore, we explicitly quantify the number of hidden units required for approximation. Specifically, if X ⊆ R^n is compact, then a neural network with n input units, m output units, and a single hidden layer with (n+d choose d) hidden units (independent of m and ε) can uniformly approximate any polynomial function f: X → R^m whose total degree is at most d in each of its m coordinate functions. In the general case where f is any continuous function, we show there exists some N ∈ O(ε^{-n}) (independent of m) such that N hidden units suffice to approximate f. We also show that this uniform approximation property (UAP) still holds even under seemingly strong conditions imposed on the weights. We highlight several consequences: (i) For any δ > 0, the UAP still holds if we restrict all non-bias weights w in the last layer to satisfy |w| < δ. (ii) There exists some λ > 0 (depending only on f and σ) such that the UAP still holds if we restrict all non-bias weights w in the first layer to satisfy |w| > λ. (iii) If the non-bias weights in the first layer are fixed and randomly chosen from a suitable range, then the UAP holds with probability 1.


The universal approximation theorem for complex-valued neural networks

We generalize the classical universal approximation theorem for neural n...

Asymptotic Properties of Neural Network Sieve Estimators

Neural networks are one of the most popularly used methods in machine le...

On approximating ∇ f with neural networks

Consider a feedforward neural network ψ: R^d→R^d such that ψ≈∇ f, where ...

Universal Approximation in Dropout Neural Networks

We prove two universal approximation theorems for a range of dropout neu...

Two-hidden-layer Feedforward Neural Networks are Universal Approximators: A Constructive Approach

It is well known that Artificial Neural Networks are universal approxima...

Efficient Design of Neural Networks with Random Weights

Single layer feedforward networks with random weights are known for thei...

LU decomposition and Toeplitz decomposition of a neural network

It is well-known that any matrix A has an LU decomposition. Less well-kn...