We present DFormer, a novel RGB-D pretraining framework to learn transfe...
Recent advancements in diffusion models have showcased their impressive
...
We aim at providing the object detection community with an efficient and...
In this paper, we consider the problem of referring camouflaged object
d...
In this paper, we present a simple but performant semi-supervised semant...
Understanding whether self-supervised learning methods can scale with
un...
We present All-Pairs Multi-Field Transforms (AMT), a new network archite...
A significant research effort is focused on exploiting the amazing capac...
Previous works have shown that increasing the window size for
Transforme...
Traffic scene parsing is one of the most important tasks to achieve
inte...
Contrastive Masked Autoencoder (CMAE), as a new self-supervised framewor...
Ensemble learning serves as a straightforward way to improve the perform...
How to identify and segment camouflaged objects from the background is
c...
This paper does not attempt to design a state-of-the-art method for visu...
Masked image modeling (MIM) has achieved promising results on various vi...
Mining precise class-aware attention maps, a.k.a, class activation maps,...
Visual recognition has been dominated by convolutional neural networks (...
In this paper, we present Vision Permutator, a conceptually simple and d...
Modern pre-trained language models are mostly built upon backbones stack...
This paper provides a strong baseline for vision transformers on the Ima...
Detecting transparent objects in natural scenes is challenging due to th...
Vision transformers (ViTs) have been successfully applied in image
class...
Current neural architecture search (NAS) algorithms still require expert...
Recent studies on mobile network design have demonstrated the remarkable...
Label smoothing is an effective regularization tool for deep neural netw...
Benefiting from the capability of building inter-dependencies among chan...
The inverted residual block is dominating architecture design for mobile...
Object region mining is a critical step for weakly-supervised semantic
s...
In this paper, we solve three low-level pixel-wise vision problems, incl...
Spatial pooling has been proven highly effective in capturing long-range...
Previous adversarial domain alignment methods for unsupervised domain
ad...
Feature pyramid network (FPN) based models, which fuse the semantics and...
The use of RGB-D information for salient object detection has been explo...
We solve the problem of salient object detection by investigating how to...
Recently, adversarial erasing for weakly-supervised object attention has...
Current CNN-based solutions to salient object detection (SOD) mainly rel...
In this paper, we aim at solving pixel-wise binary problems, including
s...
In this paper, we improve semantic segmentation by automatically learnin...
In this paper, we provide a comprehensive evaluation of salient object
d...
In this paper, we consider an interesting vision problem---salient insta...
Benefiting from its high efficiency and simplicity, Simple Linear Iterat...
Recent progress on saliency detection is substantial, benefiting mostly ...
Detecting and segmenting salient objects in natural scenes, often referr...