Object understanding in egocentric visual data is arguably a fundamental...
Object detection has been expanded from a limited number of categories t...
Continual Learning, also known as Lifelong or Incremental Learning, has ...
As an important area in computer vision, object tracking has formed two ...
Efficient video architecture is the key to deploying video recognition s...
Conventional video models rely on a single stream to capture the complex...
This work targets designing a principled and unified training-free frame...
We present Multiscale Vision Transformers (MViT) for video and image rec...
Differentiable Neural Architecture Search (NAS) requires all layer choices...
The long-tail distribution of the visual world poses great challenges fo...
Understanding temporal information and how the visual world changes over...
In natural images, information is conveyed at different frequencies wher...
Motion has been shown to be useful for video understanding, where motion is t...
Globally modeling and reasoning over relations between regions can be be...
This paper describes a procedure for the creation of large-scale video d...
We study the problem of automatically building hypernym taxonomies from ...
State-of-the-art results of semantic segmentation are established by Ful...
Photo retouching enables photographers to invoke dramatic visual impress...
In image classification, visual separability between different object ca...