Self-supervised Character-to-Character Distillation

11/01/2022
by   Tongkun Guan, et al.

When handling complicated text images (e.g., irregular structures, low resolution, heavy occlusion, and uneven illumination), existing supervised text recognition methods are data-hungry. Although these methods employ large-scale synthetic text images to reduce the dependence on annotated real images, the domain gap between synthetic and real data still limits recognition performance. Therefore, learning robust text feature representations from unlabeled real images through self-supervised learning is a promising solution. However, existing self-supervised text recognition methods only perform sequence-to-sequence representation learning by roughly splitting the visual features along the horizontal axis, which damages the character structures. Moreover, these sequence-level self-learning methods restrict the use of geometry-based data augmentation, because strong geometric augmentation leads to sequence-to-sequence inconsistency. To address these issues, we propose a novel self-supervised character-to-character distillation method, CCD. Specifically, we delineate the character structures of unlabeled real images with a self-supervised character segmentation module, and then use the segmentation results to build character-level representation learning. CCD differs from prior work in that it introduces a character-level pretext task to learn more fine-grained feature representations. In addition, in contrast to the inflexible augmentations of sequence-to-sequence models, our method maintains character-to-character representation consistency across various transformations (e.g., geometry and colour), producing robust text features in the representation space. Experiments demonstrate that CCD achieves state-of-the-art performance on publicly available text recognition benchmarks.
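The character-level consistency idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the character masks are assumed to be supplied by some segmentation step, and the softmax cross-entropy loss with two temperatures is an assumed stand-in for a distillation objective that the abstract does not specify. The sketch pools features inside each character mask and compares the pooled features of the same character across two augmented views.

```python
import math

def masked_average(features, mask):
    """Average the feature vectors at positions where mask is 1.

    features: H x W x C nested lists; mask: H x W nested lists of 0/1.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        return total  # empty mask: fall back to a zero vector
    return [t / count for t in total]

def softmax(logits, temperature):
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]

def log_softmax(logits, temperature):
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_norm = math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - peak - log_norm for x in scaled]

def distillation_loss(student_vec, teacher_vec, t_student=0.1, t_teacher=0.04):
    """Cross-entropy from teacher to student distribution.

    The loss form and both temperatures are assumptions for illustration.
    """
    p_teacher = softmax(teacher_vec, t_teacher)
    log_p_student = log_softmax(student_vec, t_student)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))

def character_to_character_loss(feats_a, masks_a, feats_b, masks_b):
    """Mean loss over characters; masks_a[i] and masks_b[i] must mark the
    same character in the two views (student view A, teacher view B)."""
    if not masks_a or len(masks_a) != len(masks_b):
        raise ValueError("need the same non-zero number of masks per view")
    losses = [
        distillation_loss(masked_average(feats_a, ma), masked_average(feats_b, mb))
        for ma, mb in zip(masks_a, masks_b)
    ]
    return sum(losses) / len(losses)

# Toy example: view B is a horizontal flip of view A (a geometric change).
feats_a = [[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]]
masks_a = [[[1, 1, 0, 0]], [[0, 0, 1, 1]]]
feats_b = [[[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]]
masks_b = [[[0, 0, 1, 1]], [[1, 1, 0, 0]]]  # masks flipped with the image

matched = character_to_character_loss(feats_a, masks_a, feats_b, masks_b)
mismatched = character_to_character_loss(feats_a, masks_a, feats_b, masks_b[::-1])
```

In this toy case the flip changes where each character sits, but because the masks move with the image, each character is still compared with itself and `matched` stays small. Pairing the masks in the wrong order (`mismatched`) compares different characters and gives a much larger loss, which is the kind of inconsistency a fixed horizontal split would face under geometric augmentation.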

