Vision-language models (VLMs) have recently demonstrated strong efficacy...
We present a novel framework for probing and improving relational,
compo...
If a Large Language Model (LLM) answers "yes" to the question "Are mount...
Current pre-trained language models have lots of knowledge, but a more
l...
Can we develop visually grounded dialog agents that can efficiently adap...
Consider a collaborative task that requires communication. Two agents ar...
We propose a technique for making Convolutional Neural Network (CNN)-bas...
Neural sequence models are widely used to model time-series data in many...
We propose a technique for producing "visual explanations" for decisions...
Many practical perception systems exist within larger processes that inc...
Convolutional Neural Networks have achieved state-of-the-art performance...
One major challenge in training Deep Neural Networks is preventing
overf...
We present a two-module approach to semantic segmentation that incorpora...