Parsing Table Structures in the Wild

09/06/2021
by Rujiao Long, et al.

This paper tackles the problem of table structure parsing (TSP) from images in the wild. In contrast to existing studies, which mainly focus on parsing well-aligned tabular images with simple layouts from scanned PDF documents, we aim to establish a practical table structure parsing system for real-world scenarios where tabular input images are photographed or scanned with severe deformation, bending, or occlusion. To build such a system, we propose an approach named Cycle-CenterNet, built on top of CenterNet with a novel cycle-pairing module that simultaneously detects tabular cells and groups them into structured tables. Within the cycle-pairing module, we propose a new pairing loss function for network training. Alongside Cycle-CenterNet, we also present a large-scale dataset named Wired Table in the Wild (WTW), which provides well-annotated table structures for tables of multiple styles across several scenes, such as photos, scanned files, and web pages. In experiments, we demonstrate that Cycle-CenterNet consistently achieves the best table structure parsing accuracy on the new WTW dataset, with a 24.6% absolute improvement as measured by the TEDS metric. A more comprehensive experimental analysis also validates the advantages of our proposed methods for the TSP task.
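
For readers unfamiliar with the metric, TEDS (tree-edit-distance-based similarity) is commonly defined as below. This is a sketch of the standard definition, not something stated in the abstract itself. The symbols T_a and T_b (the predicted and ground-truth tables represented as trees), EditDist (tree edit distance), and |T| (the number of nodes in T) are notation assumed here for illustration.

```latex
\mathrm{TEDS}(T_a, T_b) = 1 - \frac{\mathrm{EditDist}(T_a, T_b)}{\max\left(|T_a|,\, |T_b|\right)}
```

Under this definition the score lies in [0, 1], with 1 indicating identical structures, so an absolute improvement of 24.6% corresponds to a gain of 24.6 percentage points in the score rather than a relative gain.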


