Conformal prediction for text infilling and part-of-speech prediction

11/04/2021
by Neil Dey, et al.

Modern machine learning algorithms are capable of providing remarkably accurate point predictions; however, questions remain about their statistical reliability. Unlike conventional machine learning methods, conformal prediction algorithms return confidence sets (i.e., set-valued predictions) that correspond to a given significance level. Moreover, these confidence sets are valid in the sense that they guarantee finite-sample control over type I error probabilities, allowing the practitioner to choose an acceptable error rate. In this paper, we propose inductive conformal prediction (ICP) algorithms for the tasks of text infilling and part-of-speech (POS) prediction for natural language data. We construct new conformal prediction-enhanced bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory (BiLSTM) algorithms for POS tagging and a new conformal prediction-enhanced BERT algorithm for text infilling. We analyze the performance of the algorithms in simulations using the Brown Corpus, which contains over 57,000 sentences. Our results demonstrate that the ICP algorithms produce valid set-valued predictions that are small enough to be useful in real-world applications. We also provide a real-data example of how our proposed set-valued predictions can improve machine-generated audio transcriptions.
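
As a rough illustration of the idea summarized above, the following Python sketch shows one common way to build an inductive conformal prediction set for a classifier. It is not the authors' implementation: the nonconformity score (one minus the predicted probability of a label), the function names, and the toy POS-style labels and probabilities are all assumptions made for illustration. In practice the probabilities would come from a trained model such as BERT or a BiLSTM, evaluated on a held-out calibration set.

```python
from typing import Dict, List, Sequence, Set


def nonconformity(prob: float) -> float:
    """Assumed score: labels the model finds unlikely are 'strange'."""
    return 1.0 - prob


def calibration_scores(true_label_probs: Sequence[float]) -> List[float]:
    """Scores of the TRUE labels on a held-out calibration set."""
    return [nonconformity(p) for p in true_label_probs]


def p_value(cal_scores: Sequence[float], score: float) -> float:
    """Share of calibration scores at least as large as the candidate's
    score, with the usual +1 correction in numerator and denominator."""
    n_at_least = sum(1 for a in cal_scores if a >= score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def prediction_set(
    cal_scores: Sequence[float],
    label_probs: Dict[str, float],
    epsilon: float,
) -> Set[str]:
    """Keep every candidate label whose p-value exceeds epsilon.

    If calibration and test examples are exchangeable, the true label is
    excluded with probability at most epsilon.
    """
    return {
        label
        for label, prob in label_probs.items()
        if p_value(cal_scores, nonconformity(prob)) > epsilon
    }


# Hypothetical model probabilities of the true tag on nine calibration tokens.
cal = calibration_scores([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1])

# Hypothetical predicted tag distribution for one new token.
new_token = {"NOUN": 0.85, "VERB": 0.12, "ADJ": 0.03}

print(sorted(prediction_set(cal, new_token, epsilon=0.15)))  # ['NOUN', 'VERB']
print(sorted(prediction_set(cal, new_token, epsilon=0.25)))  # ['NOUN']
```

Allowing a larger error rate (a larger epsilon) gives a smaller set, which is the trade-off the practitioner controls when choosing the significance level.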

research
10/21/2015

Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network

Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN...
research
01/09/2020

Binary and Multitask Classification Model for Dutch Anaphora Resolution: Die/Dat Prediction

The correct use of Dutch pronouns 'die' and 'dat' is a stumbling block f...
research
06/28/2017

HTM-MAT: An online prediction software toolbox based on cortical machine learning algorithm

HTM-MAT is a MATLAB based toolbox for implementing cortical learning alg...
research
02/08/2022

Conformal prediction for the design problem

In many real-world deployments of machine learning, we use a prediction ...
research
08/04/2020

Reliable Part-of-Speech Tagging of Historical Corpora through Set-Valued Prediction

Syntactic annotation of corpora in the form of part-of-speech (POS) tags...
research
04/28/2021

Finite-sample Efficient Conformal Prediction

Conformal prediction is a generic methodology for finite-sample valid di...
research
05/16/2020

Conformal Prediction: a Unified Review of Theory and New Challenges

In this work we provide a review of basic ideas and novel developments a...
