On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences
We consider the class of noisy multi-layered sigmoid recurrent neural networks with w (unbounded) weights for classification of sequences of length T, where independent noise distributed according to 𝒩(0, σ²) is added to the output of each neuron in the network. Our main result shows that the sample complexity of PAC learning this class can be bounded by O(w log(T/σ)). For the non-noisy version of the same class (i.e., σ = 0), we prove a lower bound of Ω(wT) on the sample complexity. Our results indicate an exponential gap in the dependence of the sample complexity on T between noisy and non-noisy networks. Moreover, given the mild logarithmic dependence of the upper bound on 1/σ, this gap still holds even for numerically negligible values of σ.
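To make the model class concrete, the following is a minimal sketch (not the paper's code) of a single-layer noisy sigmoid RNN forward pass, where independent 𝒩(0, σ²) noise is added to the output of each neuron; all names, shapes, and the single-layer/binary-readout setup are illustrative assumptions, and setting sigma = 0 recovers the non-noisy variant.

```python
import numpy as np

def noisy_sigmoid_rnn_forward(x_seq, W_in, W_rec, W_out, sigma, rng=None):
    """Illustrative forward pass of a noisy sigmoid RNN (hypothetical names).

    x_seq : array of shape (T, d_in)   -- input sequence of length T
    W_in  : array of shape (d_h, d_in) -- input-to-hidden weights
    W_rec : array of shape (d_h, d_h)  -- hidden-to-hidden weights
    W_out : array of shape (1, d_h)    -- hidden-to-output weights
    sigma : noise standard deviation (sigma = 0 gives the non-noisy network)
    """
    rng = np.random.default_rng() if rng is None else rng
    T, _ = x_seq.shape
    d_h = W_rec.shape[0]
    h = np.zeros(d_h)
    for t in range(T):
        pre = W_in @ x_seq[t] + W_rec @ h
        # Sigmoid activation, then independent N(0, sigma^2) noise added
        # to each neuron's output, as in the abstract's model class.
        h = 1.0 / (1.0 + np.exp(-pre)) + sigma * rng.standard_normal(d_h)
    # Sign of the final readout gives the binary classification label.
    return np.sign(W_out @ h).item()
```

A multi-layered version would stack several such recurrent layers, adding noise after every neuron in every layer; the sample complexity bounds in the abstract are stated for that more general class.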