Does computer vision matter for action?

05/30/2019
by Brady Zhou, et al.

Computer vision produces representations of scene content. Much computer vision research is predicated on the assumption that these intermediate representations are useful for action. Recent work at the intersection of machine learning and robotics calls this assumption into question by training sensorimotor systems directly for the task at hand, from pixels to actions, with no explicit intermediate representations. Thus the central question of our work: Does computer vision matter for action? We probe this question and its offshoots via immersive simulation, which allows us to conduct controlled, reproducible experiments at scale. We instrument immersive three-dimensional environments to simulate challenges such as urban driving, off-road trail traversal, and battle. Our main finding is that computer vision does matter. Models equipped with intermediate representations train faster, achieve higher task performance, and generalize better to previously unseen environments. A video that summarizes the work and illustrates the results can be found at https://youtu.be/4MfWa2yZ0Jc


