Chinese Text Recognition with A Pre-Trained CLIP-Like Model Through Image-IDS Aligning

09/03/2023
by Haiyang Yu, et al.

Scene text recognition has been studied for decades due to its broad applications. However, although Chinese characters differ from Latin characters in important ways, such as complex inner structures and a large number of categories, few methods have been proposed for Chinese Text Recognition (CTR). In particular, the large number of categories makes it challenging to handle zero-shot and few-shot Chinese characters. In this paper, inspired by the way humans recognize Chinese texts, we propose a two-stage framework for CTR. First, we pre-train a CLIP-like model by aligning printed character images with Ideographic Description Sequences (IDS). This pre-training stage simulates how humans recognize Chinese characters and yields a canonical representation of each character. Subsequently, the learned representations are used to supervise the CTR model, so that traditional single-character recognition can be extended to text-line recognition through image-IDS matching. To evaluate the effectiveness of the proposed method, we conduct extensive experiments on both Chinese character recognition (CCR) and CTR. The experimental results demonstrate that the proposed method performs best in CCR and outperforms previous methods in most scenarios of the CTR benchmark. Notably, the proposed method can recognize zero-shot Chinese characters in text images without fine-tuning, whereas previous methods require fine-tuning when new classes appear. The code is available at https://github.com/FudanVI/FudanOCR/tree/main/image-ids-CTR.
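
The matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `match_character`, the three-dimensional vectors, and the use of cosine similarity are all assumptions chosen for illustration. In the actual method the canonical representations would come from the pre-trained IDS encoder and the image features from the CTR model. The sketch only shows why a new character can be recognized without fine-tuning: adding its canonical representation to the lookup table is enough.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_character(image_feature, canonical):
    """Return the character whose canonical representation is most similar."""
    return max(canonical, key=lambda ch: cosine(image_feature, canonical[ch]))


# Hypothetical canonical representations (made-up numbers, not model outputs).
# IDS for reference: 好 = ⿰女子, 明 = ⿰日月.
canonical = {
    "好": [0.9, 0.1, 0.0],
    "明": [0.1, 0.9, 0.1],
}

# Hypothetical feature extracted from one character position in a text image.
feature = [0.8, 0.2, 0.1]
predicted = match_character(feature, canonical)

# Zero-shot case: a character unseen so far (林 = ⿰木木) is supported by
# adding its representation to the table, with no retraining of the matcher.
canonical["林"] = [0.0, 0.1, 0.95]
new_feature = [0.1, 0.0, 0.9]
predicted_new = match_character(new_feature, canonical)

print(predicted, predicted_new)
```

Here the set of recognizable classes is defined by the table of canonical representations rather than by a fixed classification layer, which is the property that lets the class set grow without fine-tuning.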

research
06/22/2021

Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition

Chinese character recognition has attracted much research interest due t...
research
08/04/2023

Universal Defensive Underpainting Patch: Making Your Text Invisible to Optical Character Recognition

Optical Character Recognition (OCR) enables automatic text extraction fr...
research
07/09/2020

Maximum Entropy Regularization and Chinese Text Recognition

Chinese text recognition is more challenging than Latin text due to the ...
research
07/17/2022

Stroke-Based Autoencoders: Self-Supervised Learners for Efficient Zero-Shot Chinese Character Recognition

Chinese characters carry a wealth of morphological and semantic informat...
research
04/30/2022

SVTR: Scene Text Recognition with a Single Visual Model

Dominant scene text recognition models commonly contain two building blo...
research
07/10/2023

Stroke Extraction of Chinese Character Based on Deep Structure Deformable Image Registration

Stroke extraction of Chinese characters plays an important role in the f...
research
07/21/2023

Character Time-series Matching For Robust License Plate Recognition

Automatic License Plate Recognition (ALPR) is becoming a popular study a...
