Generalised Perceptron Learning
We present a generalisation of Rosenblatt's traditional perceptron learning algorithm to the class of proximal activation functions and demonstrate how this generalisation can be interpreted as an incremental gradient method applied to a novel energy function. This novel energy function is based on a generalised Bregman distance, for which the gradient with respect to the weights and biases does not require the differentiation of the activation function. The interpretation as an energy minimisation algorithm paves the way for many new algorithms, of which we explore a novel variant of the iterative soft-thresholding algorithm for the learning of sparse perceptrons.
READ FULL TEXT