TriTransNet: RGB-D Salient Object Detection with a Triplet Transformer Embedding Network

08/09/2021
by Zhengyi Liu, et al.

Salient object detection is a pixel-level dense prediction task that highlights the most prominent objects in a scene. The U-Net framework has recently been widely adopted for this task: its successive convolution and pooling operations generate multi-level features that are complementary to each other. Because high-level features contribute more to performance, we propose a triplet transformer embedding module that enhances them by learning long-range dependencies across layers. It is the first to use three transformer encoders with shared weights to enhance multi-level features. By further designing a scale adjustment module to process the input, devising a three-stream decoder to process the output, and attaching depth features to color features for multi-modal fusion, the proposed triplet transformer embedding network (TriTransNet) achieves state-of-the-art performance in RGB-D salient object detection. Experimental results demonstrate the effectiveness of the proposed modules and the competitiveness of TriTransNet.
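
The idea of three transformer encoders with shared weights can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each of the three feature streams has already been flattened into a list of token vectors of equal dimension, and it stands in for a full transformer encoder with a single-head self-attention layer plus a residual connection. The names `SharedSelfAttention` and `triplet_embedding` are hypothetical and chosen only for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


class SharedSelfAttention:
    """Toy stand-in for a transformer encoder: one attention head plus a residual."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def init():
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

        self.dim = dim
        # One set of projection weights, reused for every stream.
        self.wq, self.wk, self.wv = init(), init(), init()

    def __call__(self, tokens):
        q = matmul(tokens, self.wq)
        k = matmul(tokens, self.wk)
        v = matmul(tokens, self.wv)
        k_t = [list(col) for col in zip(*k)]
        scale = math.sqrt(self.dim)
        # Every token attends to every other token (long-range dependencies).
        attn = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
        out = matmul(attn, v)
        # Residual connection: enhanced tokens = input + attended values.
        return [[t + o for t, o in zip(t_row, o_row)]
                for t_row, o_row in zip(tokens, out)]


def triplet_embedding(streams, encoder):
    """Apply the same encoder object (hence the same weights) to each stream."""
    return [encoder(tokens) for tokens in streams]


if __name__ == "__main__":
    rng = random.Random(42)
    dim = 4
    # Three hypothetical streams with 3, 5, and 3 tokens respectively.
    streams = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n)]
               for n in (3, 5, 3)]
    enhanced = triplet_embedding(streams, SharedSelfAttention(dim))
    print([(len(s), len(s[0])) for s in enhanced])
```

Because a single encoder object processes all three streams, the parameter count does not grow with the number of streams, and identical inputs in different streams yield identical outputs.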

research
11/12/2022

Multistep feature aggregation framework for salient object detection

Recent works on salient object detection have made use of multi-scale fe...
research
03/21/2022

GroupTransNet: Group Transformer Network for RGB-D Salient Object Detection

Salient object detection on RGB-D images is an active topic in computer ...
research
07/06/2020

BBS-Net: RGB-D Salient Object Detection with a Bifurcated Backbone Strategy Network

Multi-level feature fusion is a fundamental topic in computer vision for...
research
04/18/2019

Cascaded Partial Decoder for Fast and Accurate Salient Object Detection

Existing state-of-the-art salient object detection networks rely on aggr...
research
05/21/2020

Instance-aware Image Colorization

Image colorization is inherently an ill-posed problem with multi-modal u...
research
11/04/2022

OSIC: A New One-Stage Image Captioner Coined

Mainstream image caption models are usually two-stage captioners, i.e., ...
research
05/03/2021

CMA-Net: A Cascaded Mutual Attention Network for Light Field Salient Object Detection

In the past few years, numerous deep learning methods have been proposed...