Mechanism of feature learning in convolutional neural networks

by Daniel Beaglehole, et al.

Understanding the mechanism of how convolutional neural networks learn features from image data is a fundamental problem in machine learning and computer vision. In this work, we identify such a mechanism. We posit the Convolutional Neural Feature Ansatz, which states that covariances of filters in any convolutional layer are proportional to the average gradient outer product (AGOP) taken with respect to patches of the input to that layer. We present extensive empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs for convolutional layers in standard neural architectures, such as AlexNet, VGG, and ResNets pre-trained on ImageNet. We also provide supporting theoretical evidence. We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines. We refer to the resulting algorithm as (Deep) ConvRFM and show that our algorithm recovers features similar to those of deep convolutional networks, including the notable emergence of edge detectors. Moreover, we find that Deep ConvRFM overcomes previously identified limitations of convolutional kernels, such as their inability to adapt to local signals in images, and, as a result, leads to a sizable performance improvement over fixed convolutional kernels.
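
To make the two quantities in the ansatz concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a toy single-layer model, f(x) = sum over patch positions of readout · tanh(filters @ patch), with single-channel images, stride 1, and no padding. The helper names (`extract_patches`, `patch_agop`, `filter_covariance`, `matrix_correlation`) are hypothetical, and the exact normalization used in the paper may differ.

```python
import numpy as np

def extract_patches(images, q):
    """Collect all q x q patches (stride 1, no padding), flattened.

    images: (n, h, w)  ->  patches: (n, num_positions, q*q)
    """
    n, h, w = images.shape
    patches = [
        images[:, i:i + q, j:j + q].reshape(n, q * q)
        for i in range(h - q + 1)
        for j in range(w - q + 1)
    ]
    return np.stack(patches, axis=1)

def patch_agop(filters, readout, patches):
    """Average gradient outer product with respect to patches.

    Assumed toy model: f(x) = sum_pos readout . tanh(filters @ patch_pos).
    The gradient of f with respect to one patch p is
    filters^T (readout * tanh'(filters @ p)).
    """
    pre = patches @ filters.T               # (n, P, k) pre-activations
    dact = 1.0 - np.tanh(pre) ** 2          # derivative of tanh
    grads = (dact * readout) @ filters      # (n, P, q*q) per-patch gradients
    flat = grads.reshape(-1, grads.shape[-1])
    return flat.T @ flat / flat.shape[0]    # average over samples and positions

def filter_covariance(filters):
    """Uncentered covariance of the filters: W^T W, shape (q*q, q*q)."""
    return filters.T @ filters

def matrix_correlation(a, b):
    """Pearson correlation between the flattened entries of two matrices."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

# Illustrative random data and weights (assumptions, not trained values).
rng = np.random.default_rng(0)
q, k = 3, 8
images = rng.normal(size=(16, 8, 8))
filters = rng.normal(size=(k, q * q))
readout = rng.normal(size=k)

patches = extract_patches(images, q)
agop = patch_agop(filters, readout, patches)
cov = filter_covariance(filters)
corr = matrix_correlation(cov, agop)
print("correlation between filter covariance and patch AGOP:", corr)
```

The abstract reports high correlation for layers of pre-trained networks. The weights here are random, so the printed value only demonstrates the computation and is not evidence for or against the ansatz.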

