Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents

06/05/2023
by David Kreuzer, et al.

The extraction of text in high quality is essential for text-based document analysis tasks like Document Classification or Named Entity Recognition. Unfortunately, this is not always ensured, as poor scan quality and the resulting artifacts lead to errors in the Optical Character Recognition (OCR) process. Current approaches using Convolutional Neural Networks show promising results for background removal tasks but fail to correct artifacts like pixelation or compression errors. For general images, Transformer backbones are increasingly being integrated into well-known neural network structures for denoising tasks. In this work, a modified UNet structure using a Swin Transformer backbone is presented to remove typical artifacts in scanned documents. Multi-headed cross-attention skip connections are used to learn features more selectively at the respective levels of abstraction. The performance of this approach is examined with regard to compression errors, pixelation and random noise. An improvement in text extraction quality with a reduced error rate of up to 53.9% is achieved, and the base model can easily be adapted to new artifacts. The cross-attention skip connections make it possible to integrate textual information, extracted from the encoder or provided in the form of commands, to control the model's output more selectively. The latter is shown by means of an example application.
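The core architectural idea of the abstract can be illustrated with a short sketch: instead of concatenating encoder features onto the decoder path, a multi-headed cross-attention block lets decoder features act as queries over the encoder features of the same resolution, so the skip path passes information selectively. The following PyTorch module is a hypothetical minimal sketch of such a skip connection; all names, shapes, and the residual-plus-norm layout are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class CrossAttentionSkip(nn.Module):
    """Hypothetical cross-attention skip connection (illustrative sketch).

    Decoder feature maps provide the queries; encoder feature maps of the
    same resolution provide keys and values, so the decoder can attend
    selectively to the encoder's skip information.
    """

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, decoder_feat: torch.Tensor, encoder_feat: torch.Tensor) -> torch.Tensor:
        # Flatten spatial dimensions: (B, C, H, W) -> (B, H*W, C) token sequences.
        b, c, h, w = decoder_feat.shape
        q = decoder_feat.flatten(2).transpose(1, 2)
        kv = encoder_feat.flatten(2).transpose(1, 2)
        # Decoder tokens attend to encoder tokens at the same abstraction level.
        out, _ = self.attn(query=q, key=kv, value=kv)
        # Residual connection and layer norm, as in standard Transformer blocks.
        out = self.norm(out + q)
        return out.transpose(1, 2).reshape(b, c, h, w)

# Toy usage with small feature maps (shapes chosen arbitrarily for the sketch).
skip = CrossAttentionSkip(channels=32, num_heads=4)
dec = torch.randn(2, 32, 16, 16)
enc = torch.randn(2, 32, 16, 16)
fused = skip(dec, enc)
print(fused.shape)  # torch.Size([2, 32, 16, 16])
```

Note that the abstract also mentions conditioning the skip connections on textual information (e.g. commands); in a sketch like this, such conditioning could be injected as additional key/value tokens, but the paper's concrete mechanism is not reproduced here.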


