Learning Representations by Maximizing Mutual Information Across Views

06/03/2019
by Philip Bachman, et al.

We propose an approach to self-supervised representation learning based on maximizing mutual information between features extracted from multiple views of a shared context. For example, a context could be an image from ImageNet, and multiple views of the context could be generated by repeatedly applying data augmentation to the image. Following this approach, we develop a new model which maximizes mutual information between features extracted at multiple scales from independently-augmented copies of each input. Our model significantly outperforms prior work on the tasks we consider. Most notably, it achieves over 60% accuracy on ImageNet under the standard linear evaluation protocol, improving on prior results by over 4%. When evaluating transfer using the representations learned on ImageNet, our model achieves 50% accuracy, improving on prior results by 2%. When we extend our model to use mixture-based representations, segmentation behaviour emerges as a natural side-effect.
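To make the objective concrete, below is a minimal, single-scale sketch in PyTorch (an assumption; this is not the authors' released code). It maximizes an InfoNCE-style lower bound on the mutual information between global features of two independently-augmented views of the same images, whereas the actual model also contrasts features across multiple scales. The names SmallEncoder and infonce_loss, the temperature value, and the toy noise "augmentations" are illustrative placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallEncoder(nn.Module):
    """Toy conv encoder producing one normalized global feature per image."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, x):
        h = self.conv(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)

def infonce_loss(z1, z2, temperature=0.1):
    """InfoNCE bound: each view-1 feature must identify its paired view-2
    feature among the other items in the batch, which act as negatives."""
    logits = z1 @ z2.t() / temperature                  # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)

# Two independent "augmentations" of the same batch; real training would use
# crops, color jitter, etc., rather than additive noise.
encoder = SmallEncoder()
images = torch.randn(32, 3, 64, 64)                     # stand-in for an image batch
view1 = images + 0.05 * torch.randn_like(images)
view2 = images + 0.05 * torch.randn_like(images)
loss = infonce_loss(encoder(view1), encoder(view2))
loss.backward()
```

Minimizing this loss pushes features from matched views together and features from mismatched images apart, which is how the contrastive bound encourages the encoder to capture factors shared across views.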
