Vision-specific concepts such as "region" have played a key role in exte...
In this paper, we propose a simple attention mechanism that we call
Box-Atte...
This paper targets visual counting, where the setup is to estimate th...
In this paper, we explore how three related tasks, namely keypoint detec...
It is still challenging to build an AI system that can perform tasks tha...
A key to solving visual question answering (VQA) lies in how to fuse
...