We examined whether embedding human attention knowledge into saliency-ba...
We propose the gradient-weighted Object Detector Activation Maps (ODAM),...
In this paper, we study masked autoencoder (MAE) pretraining on videos f...
Recent years have seen the ever-increasing importance of pre-trained mod...
The adversarial vulnerability of deep neural networks (DNNs) has been
ac...
Pool-based Active Learning (AL) has achieved great success in minimizing...
Multi-view crowd counting has been previously proposed to utilize
multi-...
Recent image captioning models are achieving impressive results based on...
Active Learning (AL) is a set of techniques for reducing labeling cost b...
Recent works on 3D single object tracking treat the tracking as a
target...
Social distancing, an essential public health measure to limit the sprea...
Describing images using natural language is widely known as image captio...
Active learning aims to achieve greater accuracy with less training data...
Deep neural networks with batch normalization (BN-DNNs) are invariant to...
Nested networks or slimmable networks are neural networks whose architec...
Crowd counting in single-view images has achieved outstanding performanc...
Using weight decay to penalize the L2 norms of weights in neural network...
State-of-the-art multi-object tracking (MOT) methods follow the
tracking...
A wide range of image captioning models has been developed, achieving
si...
Current crowd counting algorithms are only concerned about the number of...
Multi-camera surveillance has been an active research topic for understa...
In recent years, vision-based crowd analysis has been studied extensivel...
Crowd counting has been studied for decades and a lot of works have achi...
Although significant progress has been made in the field of automatic im...
Online updating a tracking model to adapt to object appearance variation...
Template-matching methods for visual tracking have gained popularity rec...
Estimating the uncertainty of a Bayesian model has been investigated for...
Recently, the state-of-the-art models for image captioning have overtake...
Generative dynamic texture models (GDTMs) are widely used for dynamic te...
Attention modules connecting encoder and decoders have been widely appli...
Eye Movement analysis with Hidden Markov Models (EMHMM) is a method for
...
Image captioning is a challenging task that combines the field of comput...
Template-matching methods for visual tracking have gained popularity rec...
Recently using convolutional neural networks (CNNs) has gained popularit...
For crowded scenes, the accuracy of object-based computer vision methods...
Color theme or color palette can deeply influence the quality and the fe...
Computer vision tasks often have side information available that is help...
We propose an heterogeneous multi-task learning framework for human pose...
We present a multiple-person tracking algorithm, based on combining part...
A generalized Gaussian process model (GGPM) is a unifying framework that...
The hidden Markov model (HMM) is a widely-used generative model that cop...
The hidden Markov model (HMM) is a generative model that treats sequenti...