Learning Self-Supervised Representations from Vision and Touch for Active Sliding Perception of Deformable Surfaces

09/26/2022
by Justin Kerr, et al.

Humans make extensive use of vision and touch as complementary senses, with vision providing global information about the scene and touch measuring local information during manipulation without suffering from occlusions. In this work, we propose a novel framework for learning multi-task visuo-tactile representations in a self-supervised manner. We design a mechanism that enables a robot to autonomously collect spatially aligned visual and tactile data, a key property for downstream tasks. We then train visual and tactile encoders to embed these paired sensory inputs into a shared latent space using a cross-modal contrastive loss. The learned representations are evaluated without fine-tuning on five perception and control tasks involving deformable surfaces: tactile classification, contact localization, anomaly detection (e.g., surgical phantom tumor palpation), tactile search from a visual query (e.g., garment feature localization under occlusion), and tactile servoing along cloth edges and cables. The learned representations achieve an 80% accuracy on tactile feature classification, a 73% accuracy in distinguishing surgical materials, a 100% success rate in tactile search, and an 87.8% accuracy in tactile servoing. These results suggest the flexibility of the learned representations and mark a step toward task-agnostic visuo-tactile representation learning for robot control.
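To illustrate the kind of objective the abstract describes, the sketch below shows a generic CLIP-style symmetric cross-modal contrastive (InfoNCE) loss between paired vision and touch embeddings. This is not the paper's implementation; the encoder outputs, batch pairing convention, and temperature value are assumptions made for the example.

```python
# Minimal sketch (assumptions: paired, spatially aligned vision/touch
# embeddings of shape (batch, dim); temperature chosen arbitrarily).
import torch
import torch.nn.functional as F


def cross_modal_contrastive_loss(vision_emb, tactile_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of aligned visuo-tactile pairs."""
    # Normalize so the dot product is cosine similarity.
    v = F.normalize(vision_emb, dim=-1)
    t = F.normalize(tactile_emb, dim=-1)

    # Pairwise similarity matrix; the diagonal holds the true (aligned) pairs.
    logits = v @ t.T / temperature
    targets = torch.arange(v.shape[0], device=v.device)

    # Match vision -> touch and touch -> vision, then average.
    loss_v2t = F.cross_entropy(logits, targets)
    loss_t2v = F.cross_entropy(logits.T, targets)
    return 0.5 * (loss_v2t + loss_t2v)
```

Under such an objective, downstream tasks like tactile search from a visual query can be posed as nearest-neighbor retrieval by cosine similarity in the shared latent space, without any fine-tuning of the encoders.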


