Focusing Attention: Towards Accurate Text Recognition in Natural Images

09/07/2017
by Zhanzhan Cheng, et al.

Scene text recognition has been a hot research topic in computer vision due to its various applications. The state of the art is the attention-based encoder-decoder framework, which learns the mapping between input images and output sequences in a purely data-driven way. However, we observe that existing attention-based methods perform poorly on complicated and/or low-quality images. One major reason is that, for such images, existing methods cannot obtain accurate alignments between feature areas and targets. We call this phenomenon "attention drift". To tackle this problem, we propose the Focusing Attention Network (FAN), which employs a focusing attention mechanism to automatically draw back the drifted attention. FAN consists of two major components: an attention network (AN), which is responsible for recognizing character targets as in existing methods, and a focusing network (FN), which is responsible for adjusting attention by evaluating whether AN attends properly to the target areas in the images. Furthermore, unlike existing methods, we adopt a ResNet-based network to enrich the deep representations of scene text images. Extensive experiments on various benchmarks, including the IIIT5k, SVT and ICDAR datasets, show that the FAN method substantially outperforms existing methods.
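To make the notion of alignment between feature areas and targets more concrete, the following is a minimal NumPy sketch of how attention drift could be quantified for a single decoding step. It is not the paper's AN or FN: the dot-product scoring, the helper names `attention_weights` and `attention_center`, the toy feature values, and the use of a weighted mean position as a drift measure are all assumptions made purely for illustration.

```python
import numpy as np

def attention_weights(features, query):
    """Softmax over dot-product scores between one query and each feature column."""
    scores = features @ query            # shape (T,)
    scores = scores - scores.max()       # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def attention_center(weights, positions):
    """Attention-weighted mean position (hypothetical drift diagnostic)."""
    return float(weights @ positions)

# Toy input (made-up values): 5 feature columns, each a 2-dimensional vector.
features = np.array([[0.0, 0.0],
                     [0.0, 0.0],
                     [4.0, 0.0],
                     [0.0, 0.0],
                     [0.0, 0.0]])
positions = np.arange(5, dtype=float)    # horizontal index of each feature column
query = np.array([1.0, 0.0])             # stand-in for a decoder state

weights = attention_weights(features, query)
center = attention_center(weights, positions)

# Assumed ground truth: the current character sits at column 2.
drift = abs(center - 2.0)
```

In this toy case the weights peak on the column that holds the character, so the measured drift is close to zero. On a degraded image, the peak could land on a neighbouring column and the drift value would grow, which is the kind of misalignment the abstract describes.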
