Robot Sound Interpretation: Learning Visual-Audio Representations for Voice-Controlled Robots

09/07/2021
by   Peixin Chang, et al.

Inspired by sensorimotor theory, we propose a novel pipeline for voice-controlled robots. Previous work relies on explicit labels of sounds and images as well as extrinsic reward functions. Such approaches not only bear little resemblance to human sensorimotor development, but also require hand-tuned rewards and extensive human labor. To address these problems, we learn a representation that associates images and sound commands with minimal supervision. Using this representation, we generate an intrinsic reward function to learn robotic tasks with reinforcement learning. We demonstrate our approach on three robot platforms, a TurtleBot3, a Kuka-IIWA arm, and a Kinova Gen3 robot, each of which hears a command word, identifies the associated target object, and performs precise control to approach the target. We show empirically that our method outperforms previous work across various sound types and robotic tasks. We also successfully deploy the policy learned in a simulator to a real-world Kinova Gen3.
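As a rough illustration of how a shared image-sound representation could yield an intrinsic reward, the sketch below scores how well the current camera view matches the heard command by comparing two embedding vectors. It is a minimal sketch under stated assumptions, not the method from the paper: the function names `cosine_similarity` and `intrinsic_reward`, the use of cosine similarity, and the toy embeddings are all hypothetical, and the abstract does not specify the actual reward formula.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def intrinsic_reward(image_embedding, sound_embedding):
    """Hypothetical reward: higher when the image embedding aligns with the
    sound-command embedding in the shared space. Assumes both embeddings were
    produced by an already-trained encoder, which is not shown here."""
    return cosine_similarity(image_embedding, sound_embedding)


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for encoder outputs (illustrative only).
    command = [1.0, 0.0]
    view_of_target = [1.0, 0.0]
    view_of_other_object = [0.0, 1.0]
    print(intrinsic_reward(view_of_target, command))
    print(intrinsic_reward(view_of_other_object, command))
```

In a setup like this, the reinforcement learning agent would receive the similarity score at each step in place of a hand-tuned extrinsic reward, so views that match the commanded object are reinforced.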


research
09/19/2019

Robot Sound Interpretation: Combining Sight and Sound in Learning-Based Control

We explore the interpretation of sound for robot decision-making, inspir...
research
01/23/2023

Learning Rewards and Skills to Follow Commands with A Data Efficient Visual-Audio Representation

Based on the recent advancements in representation learning, we propose ...
research
01/05/2020

Arduino based Voice controlled Robotic Arm

The aim of this work is to present an inexpensive, light-weight and easi...
research
03/02/2023

Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning

In imitation and reinforcement learning, the cost of human supervision l...
research
12/20/2016

Unsupervised Perceptual Rewards for Imitation Learning

Reward function design and exploration time are arguably the biggest obs...
research
08/04/2022

Impact Makes a Sound and Sound Makes an Impact: Sound Guides Representations and Explorations

Sound is one of the most informative and abundant modalities in the real...
research
12/21/2020

myGym: Modular Toolkit for Visuomotor Robotic Tasks

We introduce a novel virtual robotic toolkit myGym, developed for reinfo...
