Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

by Chiori Hori, et al.

To realize human-robot collaboration, robots need to execute actions for new tasks according to human instructions, given only finite prior knowledge. Human experts can share their knowledge of how to perform a task with a robot through multi-modal instructions in their demonstrations, showing a sequence of short-horizon steps that achieves a long-horizon goal. This paper introduces a method for generating robot action sequences from instruction videos using (1) an audio-visual Transformer that converts audio-visual features and instruction speech into a sequence of robot actions called dynamic movement primitives (DMPs) and (2) style-transfer-based training that employs multi-task learning with video captioning and weakly-supervised learning with a semantic classifier to exploit unpaired video-action data. We built a system that accomplishes various cooking actions, in which an arm robot executes a DMP sequence acquired from a cooking video using the audio-visual Transformer. Experiments with the EPIC-KITCHENS-100, YouCookII, QuerYD, and in-house instruction video datasets show that the proposed method improves the quality of DMP sequences, reaching 2.3 times the METEOR score obtained with a baseline video-to-action Transformer. The model achieved 32 object.
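The multi-task idea in the abstract, training action-sequence generation jointly with video captioning, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the plain weighted sum, and the `caption_weight` value are all assumptions chosen for illustration, and the weakly-supervised semantic-classifier term is left out.

```python
import math


def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the target tokens.

    probs: one probability distribution (a list) per output step.
    target_ids: the index of the correct token at each step.
    """
    nll = [-math.log(step[t]) for step, t in zip(probs, target_ids)]
    return sum(nll) / len(nll)


def multitask_loss(action_probs, action_targets,
                   caption_probs, caption_targets,
                   caption_weight=0.5):
    """Hypothetical joint objective: action loss plus weighted caption loss.

    caption_weight is an assumed hyperparameter, not a value from the paper.
    """
    action_loss = cross_entropy(action_probs, action_targets)
    caption_loss = cross_entropy(caption_probs, caption_targets)
    return action_loss + caption_weight * caption_loss


# Toy example: a single step for each task over a two-token vocabulary.
loss = multitask_loss([[0.5, 0.5]], [0], [[0.5, 0.5]], [1])
```

In a sketch like this, the captioning term acts as an auxiliary signal: videos that have captions still contribute gradient to the shared encoder even when the action-sequence supervision is scarce.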


