Towards Self-Supervised Gaze Estimation

03/21/2022
by Arya Farkhondeh, et al.

Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These methods aim to maximize agreement between features extracted from two differently transformed views of the same image, which yields a representation that is invariant to appearance and geometric image transformations. However, the effectiveness of these approaches remains unclear in the context of gaze estimation, a structured regression task that requires equivariance under geometric transformations (e.g., rotations, horizontal flips). In this work, we propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation. We identify the most effective image transformations for self-supervised pretraining and demonstrate that SwAT, with a ResNet-50 backbone and supported by uncurated unlabeled face images, outperforms state-of-the-art gaze estimation methods and supervised baselines in various experiments. In particular, we achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks (ETH-XGaze, Gaze360, and MPIIFaceGaze).
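To make the invariance versus equivariance distinction concrete, the following minimal sketch shows how a gaze label would have to change when the input image is flipped or rotated. It is an illustration only, not the SwAT method: the function names `flip_gaze` and `rotate_gaze`, the representation of gaze as a 3D vector (x, y, z) in camera coordinates, and the sign convention of the rotation are all assumptions made for this example.

```python
import math

def flip_gaze(gaze):
    """Gaze label after a horizontal image flip (assumed convention):
    the horizontal (x) component changes sign, y and z are unchanged."""
    x, y, z = gaze
    return (-x, y, z)

def rotate_gaze(gaze, angle_deg):
    """Gaze label after an in-plane image rotation by angle_deg.
    The x and y components rotate together; z (depth) is unchanged.
    The rotation direction is an assumed convention."""
    x, y, z = gaze
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y, z)

# A hypothetical gaze direction (not taken from any dataset).
gaze = (0.3, -0.2, -0.9)

flipped = flip_gaze(gaze)          # x component negated
rotated = rotate_gaze(gaze, 90.0)  # x and y components rotated by 90 degrees

print("original:", gaze)
print("flipped: ", flipped)
print("rotated: ", rotated)
```

A representation that is fully invariant to flips and rotations would map the original and transformed images to the same features, which discards exactly the information needed to predict these different labels. That is the motivation the abstract gives for an equivariant objective.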


