Measuring Human Perception to Improve Handwritten Document Transcription

by Samuel Grieggs, et al.

The subtleties of human perception, as measured by vision scientists through the use of psychophysics, are important clues to the internal workings of visual recognition. For instance, measured reaction time can indicate whether a visual stimulus is easy for a subject to recognize, or whether it is hard. In this paper, we consider how to incorporate psychophysical measurements of visual perception into the loss function of a deep neural network being trained for a recognition task, under the assumption that such information can enforce consistency with human behavior. As a case study to assess the viability of this approach, we look at the problem of handwritten document transcription. While good progress has been made towards automatically transcribing modern handwriting, significant challenges remain in transcribing historical documents. Here we work towards a comprehensive transcription solution for Medieval manuscripts that combines networks trained using our novel loss formulation with natural language processing elements. In a baseline assessment, reliable performance is demonstrated for the standard IAM and RIMES datasets. Further, we go on to show feasibility for our approach on a previously published dataset and a new dataset of digitized Latin manuscripts, originally produced by scribes in the Cloister of St. Gall around the middle of the 9th century.
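To make the idea of folding a psychophysical measurement into a loss function concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes that each training sample comes with a measured human reaction time, that a shorter reaction time marks an easier sample, and that errors on easy samples should cost more. The function names, the linear weighting, and the use of per-sample cross-entropy (rather than a sequence loss suited to transcription) are all hypothetical choices made for illustration.

```python
import math


def psychophysical_weights(reaction_times, max_penalty=1.0):
    """Map reaction times to per-sample penalties in [0, max_penalty].

    Assumption for this sketch: the fastest (easiest) sample gets
    max_penalty and the slowest (hardest) sample gets 0.
    """
    lo, hi = min(reaction_times), max(reaction_times)
    if hi == lo:
        # No spread in the measurements, so no sample is penalized extra.
        return [0.0 for _ in reaction_times]
    return [max_penalty * (hi - rt) / (hi - lo) for rt in reaction_times]


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class, clamped to avoid log(0)."""
    return -math.log(max(probs[target], 1e-12))


def psychophysical_loss(batch_probs, targets, reaction_times, max_penalty=1.0):
    """Mean cross-entropy with each sample scaled by (1 + penalty).

    batch_probs: list of per-class probability lists, one per sample.
    targets: list of integer class indices.
    reaction_times: list of measured reaction times, one per sample.
    """
    weights = psychophysical_weights(reaction_times, max_penalty)
    total = 0.0
    for probs, target, w in zip(batch_probs, targets, weights):
        total += (1.0 + w) * cross_entropy(probs, target)
    return total / len(targets)
```

With identical reaction times the penalty vanishes and the function reduces to ordinary mean cross-entropy, so the psychophysical term only changes training where the human measurements actually distinguish easy samples from hard ones.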




