Pretraining on Interactions for Learning Grounded Affordance Representations

07/05/2022
by Jack Merullo, et al.

Lexical semantics and cognitive science point to affordances (i.e. the actions that objects support) as critical for understanding and representing nouns and verbs. However, study of these semantic features has not yet been integrated with the "foundation" models that currently dominate language representation research. We hypothesize that predictive modeling of object state over time will result in representations that encode object affordance information "for free". We train a neural network to predict objects' trajectories in a simulated interaction and show that our network's latent representations differentiate between both observed and unobserved affordances. We find that models trained using 3D simulations from our SPATIAL dataset outperform conventional 2D computer vision models trained on a similar task, and, on initial inspection, that differences between concepts correspond to expected features (e.g., roll entails rotation). Our results suggest a way in which modern deep learning approaches to grounded language learning can be integrated with traditional formal semantic notions of lexical representations.

