K-Net: Towards Unified Image Segmentation

by Wenwei Zhang, et al.

Semantic, instance, and panoptic segmentation have been addressed using different, specialized frameworks despite their underlying connections. This paper presents a unified, simple, and effective framework for these essentially similar tasks. The framework, named K-Net, segments both instances and semantic categories consistently with a group of learnable kernels, where each kernel is responsible for generating a mask for either a potential instance or a stuff class. To remedy the difficulty of distinguishing various instances, we propose a kernel update strategy that makes each kernel dynamic and conditional on its meaningful group in the input image. K-Net can be trained in an end-to-end manner with bipartite matching, and its training and inference are naturally NMS-free and box-free. Without bells and whistles, K-Net surpasses all previous state-of-the-art single-model results of panoptic segmentation on MS COCO and semantic segmentation on ADE20K with 52.1% PQ and 54.3% mIoU, respectively. Its instance segmentation performance is also on par with Cascade Mask R-CNN on MS COCO with 60%-90% faster inference speeds. Code and models will be released at https://github.com/open-mmlab/mmdetection.
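The two mechanisms named in the abstract, one mask per kernel and one-to-one matching of predictions to ground truth, can be illustrated with a toy sketch. This is not the released K-Net code. The function names (`predict_masks`, `dice_cost`, `match`), the Dice-only matching cost, and the tiny hand-made feature map are all assumptions made for illustration, and a brute-force search over permutations stands in for a proper Hungarian solver. The kernel update strategy is not modeled.

```python
import math
from itertools import permutations


def predict_masks(kernels, features):
    """Each kernel (C weights) yields one soft mask over a C x H x W feature map.

    mask[y][x] = sigmoid(sum_c kernel[c] * features[c][y][x]), i.e. a 1x1 convolution.
    """
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    masks = []
    for kernel in kernels:
        mask = [
            [
                1.0 / (1.0 + math.exp(-sum(kernel[c] * features[c][y][x]
                                           for c in range(channels))))
                for x in range(width)
            ]
            for y in range(height)
        ]
        masks.append(mask)
    return masks


def dice_cost(pred, target, eps=1e-6):
    """1 - Dice overlap between a soft mask and a binary mask (assumed cost)."""
    inter = sum(p * t for p_row, t_row in zip(pred, target)
                for p, t in zip(p_row, t_row))
    total = sum(map(sum, pred)) + sum(map(sum, target))
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def match(pred_masks, gt_masks):
    """Brute-force bipartite matching, feasible only for tiny inputs.

    Returns a tuple whose j-th entry is the index of the prediction
    assigned to ground-truth mask j. Unassigned predictions are left out.
    """
    cost = [[dice_cost(p, g) for g in gt_masks] for p in pred_masks]
    n_gt = len(gt_masks)
    return min(
        permutations(range(len(pred_masks)), n_gt),
        key=lambda perm: sum(cost[perm[j]][j] for j in range(n_gt)),
    )


# Toy 2-channel, 2x2 feature map: channel 0 fires on the left column,
# channel 1 on the right column.
features = [
    [[4.0, -4.0], [4.0, -4.0]],
    [[-4.0, 4.0], [-4.0, 4.0]],
]
# Three hypothetical kernels: right-selective, left-selective, uninformative.
kernels = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
# Two ground-truth masks: left column, right column.
gt_masks = [
    [[1, 0], [1, 0]],
    [[0, 1], [0, 1]],
]

masks = predict_masks(kernels, features)
assignment = match(masks, gt_masks)
```

In this toy setup the left-column ground truth is matched to kernel 1 and the right-column ground truth to kernel 0, while the uninformative kernel 2 stays unmatched. Because each ground-truth mask receives exactly one prediction, duplicate suppression such as NMS is not needed in the sketch, which mirrors the NMS-free property the abstract describes.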


