Calibrating Structured Output Predictors for Natural Language Processing

by Abhyuday Jagannatha, et al.
University of Massachusetts Amherst

We address the problem of calibrating prediction confidence for output entities of interest in natural language processing (NLP) applications. It is important that NLP applications such as named entity recognition and question answering produce calibrated confidence scores for their predictions, especially if the system is to be deployed in a safety-critical domain such as healthcare. However, the output space of such structured prediction models is often too large to apply binary or multi-class calibration methods directly. In this study, we propose a general calibration scheme for output entities of interest in neural-network-based structured prediction models. Our proposed method can be used with any binary-class calibration scheme and any neural network model. Additionally, we show that our calibration method can also serve as an uncertainty-aware, entity-specific decoding step that improves the performance of the underlying model at no additional training cost and with no additional data requirements. We show that our method outperforms current calibration techniques for named entity recognition, part-of-speech tagging, and question answering. Our decoding step also improves model performance across several tasks and benchmark datasets, and our method improves both calibration and model performance in out-of-domain test scenarios as well.
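As an illustration of the kind of binary-class calibration scheme the abstract refers to, the sketch below applies Platt scaling to entity-level confidence scores: a model's uncalibrated scores for predicted entities, paired with binary labels indicating whether each prediction was correct, are used to fit a sigmoid that maps scores to calibrated probabilities. This is a hedged toy example, not the paper's exact method; the `platt_scale` and `calibrated` helpers and the sample scores are assumptions for illustration only.

```python
import math

def platt_scale(scores, labels, lr=0.1, epochs=2000):
    """Fit Platt-scaling parameters (a, b) by gradient descent on the
    logistic loss of sigmoid(a * score + b) against binary labels."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s / n  # d(loss)/da for this example
            grad_b += (p - y) / n      # d(loss)/db for this example
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b

def calibrated(score, a, b):
    """Map an uncalibrated confidence score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

# Toy data: entity confidence scores and whether each prediction was correct.
scores = [0.9, 0.8, 0.95, 0.3, 0.2, 0.4]
labels = [1, 1, 1, 0, 0, 0]
a, b = platt_scale(scores, labels)
```

In a structured-prediction setting, the scores would come from, e.g., marginal probabilities of candidate entity spans, and the fitted calibrator's outputs could also be thresholded as a simple uncertainty-aware filtering step at decoding time.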

