Box2Poly: Memory-Efficient Polygon Prediction of Arbitrarily Shaped and Rotated Text

09/20/2023
by Xuyang Chen, et al.

Recent Transformer-based text detectors predict polygons by encoding the coordinates of each boundary vertex with a distinct query feature. This approach incurs significant memory overhead and struggles to capture the relationships between vertices belonging to the same instance. As a result, irregular text layouts often lead to the prediction of outlier vertices, which degrades the quality of the results. To address these challenges, we present an approach rooted in Sparse R-CNN: a cascade decoding pipeline for polygon prediction. Our method iteratively refines polygon predictions, taking into account both the scale and the location of the preceding results. With this stabilized regression pipeline, even a single feature vector guiding the regression of each polygon instance yields promising detection results. At the same time, using instance-level feature proposals substantially improves memory efficiency (>50% compared with the method DPText-DETR) and reduces inference time (>40%), with a minor performance drop on benchmarks.
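The iterative refinement idea can be illustrated with a small sketch. This is not the authors' implementation: it only assumes that each stage predicts per-vertex offsets normalized by the width and height of the preceding polygon's bounding box, which is one plausible way to make an update depend on the scale and location of the previous result. All names here (`box_to_polygon`, `refine`, `cascade`, `make_toy_predictor`) are hypothetical, and the toy predictor stands in for a learned decoder stage.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def box_to_polygon(x0: float, y0: float, x1: float, y1: float,
                   n_per_side: int = 4) -> Polygon:
    """Initialize a polygon from a box: top edge left-to-right, bottom edge right-to-left."""
    xs = [x0 + (x1 - x0) * i / (n_per_side - 1) for i in range(n_per_side)]
    top = [(x, y0) for x in xs]
    bottom = [(x, y1) for x in reversed(xs)]
    return top + bottom


def bounding_box(poly: Polygon) -> Tuple[float, float, float, float]:
    xs = [p[0] for p in poly]
    ys = [p[1] for p in poly]
    return min(xs), min(ys), max(xs), max(ys)


def refine(poly: Polygon, offsets: List[Point]) -> Polygon:
    """Apply normalized offsets, rescaled by the size of the preceding polygon."""
    x0, y0, x1, y1 = bounding_box(poly)
    w = max(x1 - x0, 1e-6)
    h = max(y1 - y0, 1e-6)
    return [(x + dx * w, y + dy * h) for (x, y), (dx, dy) in zip(poly, offsets)]


def cascade(poly: Polygon,
            predict_offsets: Callable[[Polygon, int], List[Point]],
            num_stages: int = 3) -> Polygon:
    """Run several refinement stages; each stage sees the previous stage's output."""
    for stage in range(num_stages):
        poly = refine(poly, predict_offsets(poly, stage))
    return poly


def make_toy_predictor(target: Polygon) -> Callable[[Polygon, int], List[Point]]:
    """Assumption: a stand-in for a learned stage that moves each vertex halfway to a target."""
    def predict(poly: Polygon, stage: int) -> List[Point]:
        x0, y0, x1, y1 = bounding_box(poly)
        w = max(x1 - x0, 1e-6)
        h = max(y1 - y0, 1e-6)
        return [((tx - x) * 0.5 / w, (ty - y) * 0.5 / h)
                for (x, y), (tx, ty) in zip(poly, target)]
    return predict


if __name__ == "__main__":
    start = box_to_polygon(0.0, 0.0, 10.0, 4.0, n_per_side=3)
    curved = [(0.0, 1.0), (5.0, 0.0), (10.0, 1.0),
              (10.0, 5.0), (5.0, 4.0), (0.0, 5.0)]
    print(cascade(start, make_toy_predictor(curved), num_stages=3))
```

Because every offset is expressed relative to the preceding polygon's extent, a stage produces comparable update magnitudes for small and large text instances, and each added stage in this toy setup halves the remaining vertex error.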

research
02/17/2020

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

Many approaches have recently been proposed to detect irregular scene te...
research
07/13/2021

Bidirectional Regression for Arbitrary-Shaped Text Detection

Arbitrary-shaped text detection has recently attracted increasing intere...
research
07/15/2022

Decoupling Recognition from Detection: Single Shot Self-Reliant Scene Text Spotter

Typical text spotters follow the two-stage spotting strategy: detect the...
research
12/15/2021

SPTS: Single-Point Text Spotting

Almost all scene text spotting (detection and recognition) methods rely ...
research
09/11/2017

Fused Text Segmentation Networks for Multi-oriented Scene Text Detection

In this paper, we introduce a novel end-end framework for multi-oriented...
research
03/29/2022

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

Recently, transformer-based methods have achieved promising progresses i...
research
01/04/2022

Learning Quality-aware Representation for Multi-person Pose Regression

Off-the-shelf single-stage multi-person pose regression methods generall...
