Lipschitz regularized Deep Neural Networks converge and generalize
Lipschitz regularized neural networks augment the usual fidelity (data-fitting) term used in training with a regularization term that penalizes the excess of the network's Lipschitz constant over the Lipschitz constant of the data. We prove that Lipschitz regularized neural networks converge, and provide a convergence rate, in the limit as the number of data points n → ∞. We consider the regime where perfect fitting of the data is possible, which requires the size of the network to grow with n. There are two regimes: in the case of perfect labels, we prove convergence to the label function, which corresponds to zero loss. In the case of corrupted labels, which occurs when the Lipschitz constant of the data blows up, we prove convergence to a regularized label function that solves a limiting variational problem.
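As a rough sketch of the objective described above (the precise formulation is in the full paper; the regularization weight λ, the fidelity loss ℓ, and the data Lipschitz constant Lip_0 are our notation here, not the authors'), the regularized empirical loss can be written as

J_n[f] = \frac{1}{n} \sum_{i=1}^{n} \ell\big(f(x_i), y_i\big) + \lambda \,\big(\mathrm{Lip}(f) - \mathrm{Lip}_0\big)_+ ,

where (t)_+ = \max(t, 0) and \mathrm{Lip}(f) denotes the Lipschitz constant of the network f, so the penalty is active only when the network is less smooth than the data.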