Neural Network Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

by Stéphane Lathuilière, et al.

This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people in the environment, their positions, and their physical appearance. In particular, we use recurrent neural networks and Q-learning to find an optimal action-selection policy, and we pretrain on a synthetic environment that simulates sound sources and moving participants, which avoids the need to interact with people for hours. Our experimental evaluation suggests that the proposed method is robust with respect to parameter configuration (i.e., the choice of parameter values does not have a decisive impact on performance). The best results are obtained when audio and video information are used jointly and when a late fusion strategy is employed (i.e., when the two sources of information are processed separately and then fused). Successful experiments in a real environment with the Nao robot indicate that our framework is a step forward towards the autonomous learning of a perceivable and socially acceptable gaze behavior.
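As a rough illustration of the Q-learning component mentioned in the abstract, the sketch below trains a tabular agent in a toy one-dimensional gaze task. It is not the paper's method: there is no recurrent network, no audio-visual input, and no fusion step. The environment, the reward, and every name (`ACTIONS`, `N_POS`, `step`, `train`, `greedy_action`) are assumptions made up for this example.

```python
import random

# Hypothetical toy setup: the head occupies one of N_POS discrete positions
# and a single target (a person) sits at one of them.
ACTIONS = (-1, 0, 1)  # move left, stay, move right
N_POS = 5


def step(head, target, action):
    """Move the head, clip to the valid range, reward 1 when gazing at the target."""
    head = min(max(head + action, 0), N_POS - 1)
    return head, (1.0 if head == target else 0.0)


def train(episodes=5000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(h, t): [0.0] * len(ACTIONS) for h in range(N_POS) for t in range(N_POS)}
    for _ in range(episodes):
        head, target = rng.randrange(N_POS), rng.randrange(N_POS)
        for _ in range(steps):
            state = (head, target)
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            head, reward = step(head, target, ACTIONS[a])
            # Q-learning update: bootstrap from the best action in the next state.
            best_next = max(q[(head, target)])
            q[state][a] += alpha * (reward + gamma * best_next - q[state][a])
    return q


def greedy_action(q, head, target):
    """Return the head motion with the highest learned value."""
    values = q[(head, target)]
    return ACTIONS[max(range(len(ACTIONS)), key=lambda i: values[i])]


q_table = train()
```

After training, `greedy_action(q_table, 0, 4)` should move the head towards a target on its right, and `greedy_action(q_table, 2, 2)` should keep it on a target it already faces. In the paper's setting the table would be replaced by a recurrent network that estimates action values from audio and visual observations.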




