Correcting Real-Word Spelling Errors: A New Hybrid Approach

Spelling correction is one of the main tasks in Natural Language Processing. Unlike ordinary spelling errors, real-word errors (misspellings that happen to be valid words) cannot be detected by conventional spelling correction methods. The real-word correction model proposed by Mays, Damerau and Mercer has performed well in several evaluations. This research proposes a new hybrid approach that relies on both statistical and syntactic knowledge to detect and correct real-word errors. In this model, Constraint Grammar (CG) is used to discriminate among sets of correction candidates in the search space, and Mays, Damerau and Mercer's trigram approach is adapted to estimate the probability of the syntactically well-formed correction candidates. The approach is tested on the Wall Street Journal corpus. The model may prove more practical than some other models, such as the WordNet-based method of Hirst and Budanitsky and the fixed-window-size method of Wilcox-O'Hearn and Hirst.
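To make the pipeline in the abstract concrete, the following is a minimal sketch of a Mays, Damerau and Mercer style corrector with a syntactic filter applied before scoring. It is not the paper's implementation: the abstract does not say how the trigram model is adapted or how the CG check is wired in. The names `train_trigrams`, `log_prob`, `correct`, the `variants` table, the add-one smoothing, and the value `alpha=0.99` are all illustrative assumptions, and `is_well_formed` is a hypothetical stand-in for a real Constraint Grammar check.

```python
import math
from collections import Counter

def train_trigrams(sentences):
    """Count trigrams and their two-word contexts over padded token lists."""
    tri, bi, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi, len(vocab)

def log_prob(tokens, model):
    """Add-one smoothed trigram log-probability (a toy smoothing choice)."""
    tri, bi, vocab_size = model
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    total = 0.0
    for i in range(2, len(padded)):
        ctx = tuple(padded[i - 2:i])
        total += math.log((tri[ctx + (padded[i],)] + 1) / (bi[ctx] + vocab_size))
    return total

def correct(tokens, model, variants, is_well_formed, alpha=0.99):
    """Pick the best of the original sentence and its single-word variants.

    alpha is the assumed prior that a typed word is the intended one; the
    remaining mass is split evenly over that word's real-word variants.
    is_well_formed is a hypothetical syntactic filter (CG stand-in).
    """
    best, best_score = tokens, log_prob(tokens, model) + math.log(alpha)
    for i, word in enumerate(tokens):
        options = variants.get(word, [])
        for cand in options:
            new = tokens[:i] + [cand] + tokens[i + 1:]
            if not is_well_formed(new):
                continue  # discard syntactically ill-formed candidates
            score = log_prob(new, model) + math.log((1 - alpha) / len(options))
            if score > best_score:
                best, best_score = new, score
    return best

# Toy usage: a tiny corpus and a hand-written confusion table.
corpus = [["i", "want", "a", "piece", "of", "cake"]] * 50 + [["world", "peace", "is", "good"]]
model = train_trigrams(corpus)
variants = {"peace": ["piece"], "piece": ["peace"]}
accept_all = lambda sentence: True  # placeholder: no real grammar check
print(correct(["i", "want", "a", "peace", "of", "cake"], model, variants, accept_all))
```

Filtering before scoring means the trigram model only ranks candidates that pass the syntactic check, which mirrors the division of labour the abstract describes. A real system would replace `accept_all` with a grammar-based check and the toy counts with a model trained on a large corpus.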


