Transformer-Based Visual Segmentation: A Survey

04/19/2023
by Xiangtai Li, et al.

Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several closely related settings, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research. The project page can be found at https://github.com/lxtGH/Awesome-Segmenation-With-Transformer. We will also continually monitor developments in this rapidly evolving field.

research
05/16/2022

Transformers in 3D Point Clouds: A Survey

In recent years, Transformer models have been proven to have the remarka...
research
01/09/2023

Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review

The remarkable performance of the Transformer architecture in natural la...
research
07/02/2021

A Survey on Deep Learning Technique for Video Segmentation

Video segmentation, i.e., partitioning video frames into multiple segmen...
research
01/04/2021

Transformers in Vision: A Survey

Astounding results from transformer models on natural language tasks hav...
research
06/29/2022

The Lighter The Better: Rethinking Transformers in Medical Image Segmentation Through Adaptive Pruning

Vision transformers have recently set off a new wave in the field of med...
research
03/23/2023

Position-Guided Point Cloud Panoptic Segmentation Transformer

DEtection TRansformer (DETR) started a trend that uses a group of learna...
research
06/30/2023

Transformers in Healthcare: A Survey

With Artificial Intelligence (AI) increasingly permeating various aspect...
