Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering

by Nikhil Maddikunta, et al.

In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to begin as mobile phone photos uploaded by non-professionals. Computer vision for document automation must therefore account for documents captured in natural scene contexts. An additional challenge is that task objectives for document processing can be highly use-case specific. This limits the utility of publicly available datasets, while manual data labeling is costly and translates poorly between use cases. To address these issues we created Sim2Real Docs, a framework for synthesizing datasets and performing domain randomization of documents in natural scenes. Sim2Real Docs enables programmatic 3D rendering of documents using Blender, an open source tool for 3D modeling and ray-traced rendering. By using rendering that simulates the physical interactions of light, geometry, camera, and background, we synthesize datasets of documents in a natural scene context. Each render is paired with use-case-specific ground truth data specifying latent characteristics of interest, producing an effectively unlimited supply of fit-for-task training data. The role of machine learning models is then to solve the inverse problem posed by the rendering pipeline: recovering those latent characteristics from the rendered image. Such models can be further iterated upon with real-world data, either by fine-tuning or by adjusting the domain randomization parameters.
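The pairing of each render with its ground truth can be illustrated with a minimal Python sketch. It is not the Sim2Real Docs implementation: the parameter names, value ranges, and file naming below are all hypothetical, and the actual Blender rendering step is left out so that the example runs without Blender installed. It only shows the general idea of sampling randomized scene parameters and recording them as labels.

```python
import random
from dataclasses import dataclass, asdict
from typing import Dict, List


@dataclass
class SceneParams:
    """Hypothetical latent scene parameters for one synthetic render."""
    camera_distance_m: float   # distance from camera to document
    camera_tilt_deg: float     # camera tilt away from the document normal
    light_intensity_w: float   # strength of the main light source
    doc_rotation_deg: float    # in-plane rotation of the document
    background_id: int         # index into an assumed set of backgrounds


def sample_scene(rng: random.Random, n_backgrounds: int = 10) -> SceneParams:
    """Draw one randomized scene. All ranges are illustrative assumptions."""
    return SceneParams(
        camera_distance_m=rng.uniform(0.3, 0.8),
        camera_tilt_deg=rng.uniform(-30.0, 30.0),
        light_intensity_w=rng.uniform(100.0, 1000.0),
        doc_rotation_deg=rng.uniform(-180.0, 180.0),
        background_id=rng.randrange(n_backgrounds),
    )


def make_dataset(n: int, seed: int = 0) -> List[Dict]:
    """Pair each (not yet rendered) image name with its ground truth."""
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    records = []
    for i in range(n):
        params = sample_scene(rng)
        # A real pipeline would pass `params` to the renderer here and
        # write the resulting image to the path recorded below.
        records.append({
            "image": f"render_{i:05d}.png",
            "ground_truth": asdict(params),
        })
    return records


if __name__ == "__main__":
    for record in make_dataset(3, seed=42):
        print(record["image"], record["ground_truth"])
```

Because the labels are the sampled parameters themselves, no manual annotation is needed, and widening or narrowing a range in `sample_scene` is one way to adjust the randomization after comparing model behavior on real photos.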




