ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

by   Christoph Auer, et al.

Transforming documents into machine-processable representations is a challenging task due to their complex structures and variability in formats. Recovering the layout structure and content from PDF files or scanned material has remained a key problem for decades. ICDAR has a long tradition in hosting competitions to benchmark the state-of-the-art and encourage the development of novel solutions to document layout understanding. In this report, we present the results of our ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents, which posed the challenge to accurately segment the page layout in a broad range of document styles and domains, including corporate reports, technical literature and patents. To raise the bar over previous competitions, we engineered a hard competition dataset and proposed the recent DocLayNet dataset for training. We recorded 45 team registrations and received official submissions from 21 teams. In the presented solutions, we recognize interesting combinations of recent computer vision models, data augmentation strategies and ensemble methods to achieve remarkable accuracy in the task we posed. A clear trend towards adoption of vision-transformer based methods is evident. The results demonstrate substantial progress towards achieving robust and highly generalizing methods for document layout understanding.


page 1

page 2

page 3

page 4


WeLayout: WeChat Layout Analysis System for the ICDAR 2023 Competition on Robust Layout Segmentation in Corporate Documents

In this paper, we introduce WeLayout, a novel system for segmenting the ...

Ensemble of Anchor-Free Models for Robust Bangla Document Layout Segmentation

In this research paper, we introduce a novel approach designed for the p...

The Law of Large Documents: Understanding the Structure of Legal Contracts Using Visual Cues

Large, pre-trained transformer models like BERT have achieved state-of-t...

Kleister: A novel task for Information Extraction involving Long Documents with Complex Layout

State-of-the-art solutions for Natural Language Processing (NLP) are abl...

A Graphical Approach to Document Layout Analysis

Document layout analysis (DLA) is the task of detecting the distinct, se...

Page Layout Analysis of Text-heavy Historical Documents: a Comparison of Textual and Visual Approaches

Page layout analysis is a fundamental step in document processing which ...

Delivering Document Conversion as a Cloud Service with High Throughput and Responsiveness

Document understanding is a key business process in the data-driven econ...

Please sign up or login with your details

Forgot password? Click here to reset