Evaluation of HTR models without Ground Truth Material

The evaluation of Handwritten Text Recognition (HTR) models during their development is straightforward: because HTR is a supervised problem, the usual split into training, validation, and test sets allows models to be evaluated in terms of accuracy or error rates. However, evaluation becomes tricky as soon as we switch from development to application. Compiling a new (and necessarily smaller) ground truth (GT) from a sample of the data to which we want to apply the model, and then evaluating models on it, only provides hints about the quality of the recognised text, as do the confidence scores (if available) that the models return. Moreover, if we have several models at hand, we face a model selection problem, since we want to obtain the best possible result during the application phase. This calls for GT-free metrics to select the best model, which is why we (re-)introduce and compare different metrics, from simple lexicon-based ones to more elaborate ones using standard language models and masked language models (MLM). We show that MLM-based evaluation can compete with lexicon-based methods, with the advantage that large and multilingual transformers are readily available, which makes compiling lexical resources for other metrics superfluous.
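To make the idea of a GT-free, lexicon-based metric concrete, the following is a minimal sketch and not the metric defined in the paper. It assumes that a model's output can be scored by the fraction of its word tokens found in a lexicon, and that the model with the highest score is selected. The function names `lexicon_score` and `select_model`, the tokenisation rule, and the toy lexicon and outputs are all hypothetical and chosen only for illustration.

```python
import re


def lexicon_score(text, lexicon):
    """Return the fraction of word tokens in `text` that occur in `lexicon`.

    Assumption: tokens are lowercased runs of word characters; the paper
    may tokenise and normalise differently.
    """
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return 0.0  # no tokens recognised, so nothing matches the lexicon
    hits = sum(1 for tok in tokens if tok in lexicon)
    return hits / len(tokens)


def select_model(outputs, lexicon):
    """Return the name of the model whose output scores highest.

    `outputs` maps a (hypothetical) model name to its recognised text.
    """
    return max(outputs, key=lambda name: lexicon_score(outputs[name], lexicon))


# Toy data, invented for illustration only.
lexicon = {"the", "quick", "brown", "fox", "jumps"}
outputs = {
    "model_a": "the quick brown fox jumps",
    "model_b": "tbe qu1ck brown fax jumps",
}
best = select_model(outputs, lexicon)
```

A score of this kind needs no transcribed reference text, but it does require a lexicon for the language and period of the documents, which is the resource that the MLM-based alternative described above avoids.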
