TSR-DSAW: Table Structure Recognition via Deep Spatial Association of Words

by Arushi Jain, et al.

Existing methods for Table Structure Recognition (TSR) from camera-captured or scanned documents perform poorly on complex tables consisting of nested rows/columns, multi-line text and missing cell data. This is because current data-driven methods work by simply training deep models on large volumes of data and fail to generalize when an unseen table structure is encountered. In this paper, we propose to train a deep network to capture the spatial associations between different word pairs present in the table image in order to unravel the table structure. We present an end-to-end pipeline, named TSR-DSAW: TSR via Deep Spatial Association of Words, which outputs a digital representation of a table image in a structured format such as HTML. Given a table image as input, the proposed method begins by detecting all the words present in the image using a text-detection network such as CRAFT, followed by the generation of word pairs using dynamic programming. These word pairs are highlighted in individual images and subsequently fed into a DenseNet-121 classifier trained to capture spatial associations such as same-row, same-column, same-cell or none. Finally, we perform post-processing on the classifier output to generate the table structure in HTML format. We evaluate our TSR-DSAW pipeline on two public table-image datasets, PubTabNet and ICDAR 2013, and demonstrate improvement over previous methods such as TableNet and DeepDeSRT.
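To make the final post-processing step concrete, here is a minimal sketch of how pairwise labels could be turned into an HTML table. It is not the authors' implementation: the function names (`group`, `table_to_html`), the input layout (word text plus box-center coordinates), and the use of union-find to merge words into rows and columns are all assumptions made for illustration. It also assumes a plain grid without spanning cells, and that a same-cell pair implies both same-row and same-column.

```python
import html


def find(parent, i):
    # Union-find lookup with path halving.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def group(n, pairs):
    # Merge the given index pairs and return one group label per word.
    parent = list(range(n))
    for a, b in pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[rb] = ra
    return [find(parent, i) for i in range(n)]


def table_to_html(words, relations):
    """Hypothetical post-processing sketch.

    words:     list of (text, x_center, y_center) tuples
    relations: {(i, j): label} with label in
               {"same-row", "same-column", "same-cell", "none"}
    """
    n = len(words)
    # Assumption: words in the same cell share both a row and a column.
    row_of = group(n, [p for p, r in relations.items()
                       if r in ("same-row", "same-cell")])
    col_of = group(n, [p for p, r in relations.items()
                       if r in ("same-column", "same-cell")])

    def ranks(labels, axis):
        # Order groups by the mean coordinate of their member words.
        members = {}
        for i, g in enumerate(labels):
            members.setdefault(g, []).append(words[i][axis])
        ordered = sorted(members, key=lambda g: sum(members[g]) / len(members[g]))
        return {g: k for k, g in enumerate(ordered)}

    row_rank = ranks(row_of, 2)  # top to bottom, by y
    col_rank = ranks(col_of, 1)  # left to right, by x

    grid = {}
    for i, (text, x, y) in enumerate(words):
        key = (row_rank[row_of[i]], col_rank[col_of[i]])
        grid.setdefault(key, []).append((y, x, text))

    rows = []
    for r in range(len(row_rank)):
        cells = []
        for c in range(len(col_rank)):
            # Simplified reading order: sort by y, then x. Missing cells stay empty.
            tokens = sorted(grid.get((r, c), []))
            cells.append("<td>" + " ".join(html.escape(t) for _, _, t in tokens) + "</td>")
        rows.append("<tr>" + "".join(cells) + "</tr>")
    return "<table>" + "".join(rows) + "</table>"


# Toy input: a 2x2 table whose bottom-left cell holds two words.
words = [("Name", 10, 10), ("Age", 60, 10),
         ("Ann", 10, 30), ("Lee", 25, 30), ("31", 60, 30)]
relations = {
    (0, 1): "same-row", (0, 2): "same-column", (0, 3): "same-column",
    (0, 4): "none", (1, 2): "none", (1, 3): "none",
    (1, 4): "same-column", (2, 3): "same-cell",
    (2, 4): "same-row", (3, 4): "same-row",
}
print(table_to_html(words, relations))
```

In this sketch the relation labels are supplied by hand; in the pipeline described above they would come from the classifier's prediction for each highlighted word pair. Handling nested rows and columns would require span attributes that this simplified grid does not model.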




