Momentum^2 Teacher: Momentum Teacher with Momentum Statistics for Self-Supervised Learning
In this paper, we present a novel approach, Momentum^2 Teacher, for student-teacher based self-supervised learning. The approach performs momentum updates on both network weights and batch normalization (BN) statistics. The teacher's weights are a momentum update of the student's weights, and the teacher's BN statistics are a momentum update of the statistics from previous iterations. The Momentum^2 Teacher is simple and efficient. It achieves state-of-the-art results (74.5%) under the ImageNet linear evaluation protocol using a small batch size (e.g., 128), without requiring large-batch training on special hardware such as TPUs or inefficient cross-GPU operations (e.g., shuffling BN, synced BN). Our implementation and pre-trained models will be available on GitHub [https://github.com/zengarden/momentum2-teacher].
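The two momentum updates described above can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the class name `TeacherState`, the helpers `ema` and `batch_stats`, the flat-list representation of weights, and the default momentum values are all assumptions made for illustration, since the abstract does not specify them.

```python
def ema(old, new, m):
    """Elementwise exponential moving average: m * old + (1 - m) * new."""
    return [m * o + (1.0 - m) * n for o, n in zip(old, new)]


def batch_stats(batch):
    """Per-feature mean and biased variance of a batch of feature vectors."""
    n = len(batch)
    dim = len(batch[0])
    mean = [sum(x[j] for x in batch) / n for j in range(dim)]
    var = [sum((x[j] - mean[j]) ** 2 for x in batch) / n for j in range(dim)]
    return mean, var


class TeacherState:
    """Hypothetical container for a teacher's weights and BN statistics."""

    def __init__(self, weights, dim):
        self.weights = list(weights)
        self.bn_mean = [0.0] * dim  # running mean, initialised to 0
        self.bn_var = [1.0] * dim   # running variance, initialised to 1

    def update(self, student_weights, batch,
               weight_momentum=0.99, stat_momentum=0.9):
        # Momentum values here are illustrative assumptions, not the paper's.
        # 1) Teacher weights follow the student via a momentum update.
        self.weights = ema(self.weights, student_weights, weight_momentum)
        # 2) Teacher BN statistics are a momentum update of past statistics,
        #    mixed with the statistics of the current (small) batch.
        mean, var = batch_stats(batch)
        self.bn_mean = ema(self.bn_mean, mean, stat_momentum)
        self.bn_var = ema(self.bn_var, var, stat_momentum)

    def normalize(self, x, eps=1e-5):
        # Normalise one feature vector with the accumulated statistics
        # instead of the statistics of the current batch alone.
        return [(xi - mu) / (v + eps) ** 0.5
                for xi, mu, v in zip(x, self.bn_mean, self.bn_var)]
```

Because the teacher normalizes with statistics accumulated over many iterations, a small per-step batch can still yield stable estimates, which is the motivation the abstract gives for avoiding cross-GPU synchronization.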