A Conglomerate of Multiple OCR Table Detection and Extraction

10/16/2020
by Smita Pallavi, et al.

Representing information as tables is a compact and concise method that eases searching, indexing, and storage. Extracting and cloning tables from parsable documents is relatively easy and widely practiced; however, industry still faces challenges in detecting and extracting tables from OCR documents or images. This paper proposes an algorithm that detects and extracts multiple tables from an OCR document. The algorithm uses a combination of image processing techniques, text recognition, and procedural coding to identify distinct tables in the same image and map the text to the corresponding cell in a dataframe, which can be stored as comma-separated values, a database, Excel, and multiple other usable formats.
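To make the cell-mapping step concrete, here is a minimal sketch of one way recognized words could be assigned to table cells and exported as CSV. It is not the authors' algorithm: the `Word` type, the tolerance values, and the gap-based clustering are all assumptions chosen for illustration, and it handles only a single table whose OCR word boxes are already known.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical OCR result: recognized text plus the box's top-left corner."""
    text: str
    x: float
    y: float


def cluster(values, tol):
    """Group 1-D coordinates; a gap larger than tol starts a new group.

    Returns the mean coordinate of each group, in ascending order.
    """
    groups = []
    for v in sorted(values):
        if groups and v - groups[-1][-1] <= tol:
            groups[-1].append(v)
        else:
            groups.append([v])
    return [sum(g) / len(g) for g in groups]


def nearest(centers, v):
    """Index of the cluster center closest to v."""
    return min(range(len(centers)), key=lambda i: abs(centers[i] - v))


def words_to_grid(words, row_tol=10, col_tol=30):
    """Map each word to a (row, column) cell. Tolerances are assumed pixel gaps."""
    rows = cluster([w.y for w in words], row_tol)
    cols = cluster([w.x for w in words], col_tol)
    grid = [["" for _ in cols] for _ in rows]
    # Visit words top-to-bottom, then left-to-right, so that multi-word
    # cells are concatenated in approximate reading order.
    for w in sorted(words, key=lambda w: (w.y, w.x)):
        r, c = nearest(rows, w.y), nearest(cols, w.x)
        grid[r][c] = (grid[r][c] + " " + w.text).strip()
    return grid


def grid_to_csv(grid):
    """Serialize the grid as comma-separated values."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


words = [
    Word("Item", 10, 10), Word("Price", 200, 12),
    Word("Pen", 12, 50), Word("1.50", 198, 51),
]
print(grid_to_csv(words_to_grid(words)))
```

In practice the word boxes would come from an OCR engine, and a real pipeline would first separate the distinct tables in the image before running a mapping step like this on each one.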

research
10/10/2022

A two-stage approach for table extraction in invoices

The automated analysis of administrative documents is an important field...
research
10/23/2020

Extracting Body Text from Academic PDF Documents for Text Mining

Accurate extraction of body text from PDF-formatted academic documents i...
research
08/25/2020

CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Localizing page elements/objects such as tables, figures, equations, etc...
research
02/26/2019

A framework for information extraction from tables in biomedical literature

The scientific literature is growing exponentially, and professionals ar...
research
05/18/2022

Carbon Figures of Merit Knowledge Creation with a Hybrid Solution and Carbon Tables API

Nowadays there are algorithms, methods, and platforms that are being cre...
research
04/04/2021

Faster Convolution Inference Through Using Pre-Calculated Lookup Tables

Low-cardinality activations permit an algorithm based on fetching the in...
research
07/14/2022

DEXTER: An end-to-end system to extract table contents from electronic medical health documents

In this paper, we propose DEXTER, an end to end system to extract inform...
