Hidden Markov Chains, Entropic Forward-Backward, and Part-Of-Speech Tagging

05/21/2020
by Elie Azeraf et al.

The ability to take into account the characteristics, also called features, of observations is essential in Natural Language Processing (NLP) problems. The Hidden Markov Chain (HMC) model, combined with the classic Forward-Backward probabilities, cannot handle arbitrary features such as prefixes or suffixes of any size, except under an independence condition. For twenty years, this shortcoming has encouraged the development of other sequential models, starting with the Maximum Entropy Markov Model (MEMM), which integrates arbitrary features elegantly. More generally, it has led to the neglect of HMC in NLP. In this paper, we show that the problem is not due to HMC itself, but to the way its restoration algorithms are computed. We present a new way of computing HMC-based restorations using original Entropic Forward and Entropic Backward (EFB) probabilities. Our method allows features to be taken into account within the HMC framework in the same way as in the MEMM framework. We illustrate the efficiency of HMC with EFB on Part-Of-Speech tagging, showing its superiority over MEMM-based restoration. As a perspective, we also describe how HMCs with EFB might serve as an alternative to Recurrent Neural Networks for treating sequential data with a deep architecture.
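For context, the classic Forward-Backward restoration that the abstract contrasts with EFB can be sketched as follows. This is a minimal illustration of the standard HMC posterior-smoothing baseline, not the paper's entropic variant; the tiny two-tag transition and emission tables are hypothetical numbers chosen only for the example.

```python
import numpy as np

def forward_backward(pi, A, B, obs):
    """Classic Forward-Backward smoothing for a Hidden Markov Chain.

    pi:  (K,)   initial state distribution
    A:   (K, K) transitions, A[i, j] = P(x_{t+1}=j | x_t=i)
    B:   (K, V) emissions,   B[i, o] = P(y_t=o | x_t=i)
    obs: sequence of observation (word) indices
    Returns gamma, the (T, K) matrix of posterior tag marginals.
    """
    T, K = len(obs), len(pi)
    alpha = np.zeros((T, K))
    beta = np.zeros((T, K))

    # Forward pass: alpha[t, i] ∝ P(y_1..y_t, x_t=i), renormalized for stability
    alpha[0] = pi * B[:, obs[0]]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        alpha[t] /= alpha[t].sum()

    # Backward pass: beta[t, i] ∝ P(y_{t+1}..y_T | x_t=i)
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
        beta[t] /= beta[t].sum()

    # Posterior marginals: gamma[t, i] = P(x_t=i | y_1..y_T)
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    return gamma

# Toy 2-tag, 3-word sentence (hypothetical parameters)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
obs = [0, 1, 2]
gamma = forward_backward(pi, A, B, obs)
tags = gamma.argmax(axis=1)  # restored tag sequence (MPM criterion)
```

Note that the emission term `B[:, obs[t]]` is where the limitation described above bites: each observation enters only through a single conditional probability per state, so arbitrary overlapping features (prefixes, suffixes, capitalization, etc.) cannot be injected here without an independence assumption, which is precisely what the paper's EFB recursions are designed to lift.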


