AGIL: Learning Attention from Human for Visuomotor Tasks

06/01/2018
by Ruohan Zhang, et al.

When intelligent agents learn visuomotor behaviors from human demonstrations, they may benefit from knowing where the human is allocating visual attention, which can be inferred from the human's gaze. Human gaze allocation conveys a wealth of information about intelligent decision making; hence, exploiting such information has the potential to improve the agents' performance. With this motivation, we propose the AGIL (Attention Guided Imitation Learning) framework. We collect high-quality action and gaze data from humans playing Atari games in a carefully controlled experimental setting. Using these data, we first train a deep neural network that predicts human gaze positions and visual attention with high accuracy (the gaze network), and then train another network to predict human actions (the policy network). Incorporating the learned attention model from the gaze network into the policy network significantly improves action prediction accuracy and task performance.
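As a rough illustration of what "incorporating the learned attention model into the policy network" could look like at the input level, the sketch below builds a Gaussian attention map around a single gaze position and uses it to weight a game frame. This is an assumption-laden toy, not the paper's architecture: the function names `gaze_heatmap` and `apply_attention`, the 84x84 frame size, the `sigma` value, and the choice of elementwise multiplication are all hypothetical and are not taken from the abstract.

```python
import numpy as np

def gaze_heatmap(height, width, gaze_xy, sigma=5.0):
    """Gaussian bump centered on a gaze position (x, y), normalized to sum to 1.

    Hypothetical stand-in for the output of a gaze network.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy
    heat = np.exp(-((xs - gx) ** 2 + (ys - gy) ** 2) / (2.0 * sigma ** 2))
    return heat / heat.sum()

def apply_attention(frame, heatmap):
    """Weight each pixel by the heatmap, rescaled so its peak equals 1.

    Elementwise masking is an assumed way to combine attention with the
    image; the source does not specify the mechanism.
    """
    return frame * (heatmap / heatmap.max())

# Toy usage: a uniform frame, with attention centered at x=20, y=60.
frame = np.ones((84, 84), dtype=np.float32)
heat = gaze_heatmap(84, 84, gaze_xy=(20, 60))
masked = apply_attention(frame, heat)
```

In this toy setup, pixels near the gaze position keep most of their intensity while distant pixels are suppressed, so a downstream policy model would see an input that emphasizes the attended region.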


