Real-Word Error Correction with Trigrams: Correcting Multiple Errors in a Sentence

Spelling correction is a fundamental task in Text Mining. In this study, we assess the real-word error correction model proposed by Mays, Damerau and Mercer and describe several drawbacks of the model. We propose a new variation which focuses on detecting and correcting multiple real-word errors in a sentence, by manipulating a Probabilistic Context-Free Grammar (PCFG) to discriminate between items in the search space. We test our approach on the Wall Street Journal corpus and show that it outperforms Hirst and Budanitsky's WordNet-based method and Wilcox-O'Hearn, Hirst, and Budanitsky's fixed windows size method.-O'Hearn, Hirst, and Budanitsky's fixed windows size method.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset