Robot Object Retrieval with Contextual Natural Language Queries

06/23/2020
by Thao Nguyen, et al.

Natural language object retrieval is a highly useful yet challenging task for robots in human-centric environments. Previous work has primarily focused on commands that specify the desired object's type, such as "scissors," and/or visual attributes, such as "red," thus limiting the robot to known object classes. We develop a model that retrieves objects based on descriptions of their usage. The model takes in a language command containing a verb, for example "Hand me something to cut," together with RGB images of candidate objects, and selects the object that best satisfies the task specified by the verb. It directly predicts an object's appearance from the use specified by the verb phrase, so an object's class label never needs to be given explicitly. This approach lets us predict high-level concepts, such as an object's utility, from the language query. Using the contextual information present in the language commands, our model can generalize to unseen object classes and unknown nouns in the commands. Selecting among sets of five candidate objects to fulfill natural language commands, our model achieves an average accuracy of 62.3% on a held-out test set of unseen ImageNet object classes and 53.0% on unseen object classes and unknown nouns. It also achieves an average accuracy of 54.7% on objects whose image distribution differs from that of ImageNet objects. We demonstrate our model on a KUKA LBR iiwa robot arm, enabling the robot to retrieve objects based on natural language descriptions of their usage. We also present a new dataset of 655 verb-object pairs denoting object usage over 50 verbs and 216 object classes.
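The selection step described above (predict an object's appearance from the verb, then pick the best-matching candidate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_similarity` and `select_object`, the use of cosine similarity as the matching score, and the toy three-dimensional feature vectors are all assumptions made for illustration. In practice, the predicted vector would come from a language model conditioned on the verb phrase, and the candidate vectors from an image encoder applied to the RGB images.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def select_object(predicted_features, candidate_features):
    """Return the index of the best-matching candidate and all scores.

    predicted_features: hypothetical appearance vector predicted from the
        verb phrase (e.g. "cut").
    candidate_features: one hypothetical image feature vector per candidate.
    """
    scores = [cosine_similarity(predicted_features, c)
              for c in candidate_features]
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


# Toy values only: five candidates, as in the evaluation setting above.
predicted = [0.9, 0.1, 0.0]
candidates = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.1],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.5],
    [0.1, 0.9, 0.2],
]
best_index, scores = select_object(predicted, candidates)
print(best_index)
```

Because the choice depends only on feature vectors and never on a class label, a sketch of this kind can in principle rank objects from classes that were not seen during training, which is the property the abstract emphasizes.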


