Rapid Adaptation of POS Tagging for Domain Specific Uses

10/31/2014
by John E. Miller, et al.

Part-of-speech (POS) tagging is a fundamental component of natural language tasks such as parsing, information extraction, and question answering. When POS taggers are trained in one domain and applied in significantly different domains, their performance can degrade dramatically. We present a methodology for rapid adaptation of POS taggers to new domains. Our technique is unsupervised in that a manually annotated corpus for the new domain is not necessary. We use suffix information gathered from large amounts of raw text, as well as orthographic information, to increase lexical coverage. We present an experiment in the biological domain where our POS tagger achieves results comparable to POS taggers specifically trained for this domain.
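To make the idea of suffix and orthographic cues concrete, here is a minimal Python sketch. It is not the authors' implementation, which the abstract does not describe. It assumes a source-domain lexicon mapping words to Penn Treebank tags, counts which tags co-occur with each word-final suffix in raw new-domain text, and then guesses tags for out-of-lexicon words. The function names, the maximum suffix length, the orthographic rules, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict

MAX_SUFFIX = 4  # assumed maximum suffix length; not taken from the paper


def orthographic_guess(word):
    """Hypothetical orthographic rules: return a tag, or None if no rule fires."""
    if word.replace(".", "", 1).replace(",", "").isdigit():
        return "CD"   # number-like token
    if word and word[0].isupper():
        return "NNP"  # capitalized token, treated as a proper noun
    return None


def build_suffix_table(lexicon, raw_tokens):
    """Count tags per suffix, using only raw-text tokens found in the lexicon."""
    table = defaultdict(Counter)
    for token in raw_tokens:
        tags = lexicon.get(token.lower())
        if not tags:
            continue  # unknown word: contributes no evidence
        for n in range(1, min(MAX_SUFFIX, len(token) - 1) + 1):
            for tag in tags:
                table[token[-n:].lower()][tag] += 1
    return table


def guess_tag(word, table, default="NN"):
    """Guess a tag for an out-of-lexicon word: orthography first, then suffix."""
    tag = orthographic_guess(word)
    if tag is not None:
        return tag
    # Try the longest suffix first and back off to shorter ones.
    for n in range(min(MAX_SUFFIX, len(word) - 1), 0, -1):
        counts = table.get(word[-n:].lower())
        if counts:
            return counts.most_common(1)[0][0]
    return default


# Toy data, invented for illustration only.
lexicon = {
    "activation": {"NN"},
    "regulation": {"NN"},
    "binds": {"VBZ"},
    "quickly": {"RB"},
    "activated": {"VBN"},
}
raw_tokens = "activation regulation binds quickly activated phosphorylation".split()
table = build_suffix_table(lexicon, raw_tokens)

for unknown in ["phosphorylation", "methylated", "42"]:
    print(unknown, guess_tag(unknown, table))
```

In this sketch no tagged text from the new domain is needed: the only labeled resource is the source-domain lexicon, and the raw text determines how often each suffix is seen with each tag. A real system would likely combine such guesses with contextual tagging rather than use them in isolation.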


research
01/10/2020

Machine Learning Approaches for Amharic Parts-of-speech Tagging

Part-of-speech (POS) tagging is considered as one of the basic but neces...
research
10/11/2021

A Review on Part-of-Speech Technologies

Developing an automatic part-of-speech (POS) tagging for any new languag...
research
04/29/2020

A Cross-Genre Ensemble Approach to Robust Reddit Part of Speech Tagging

Part of speech tagging is a fundamental NLP task often regarded as solve...
research
04/03/2017

Combining Lexical and Syntactic Features for Detecting Content-dense Texts in News

Content-dense news report important factual information about an event i...
research
10/12/2017

Adapting general-purpose speech recognition engine output for domain-specific natural language question answering

Speech-based natural language question-answering interfaces to enterpris...
research
05/21/2019

Domain adaptation for part-of-speech tagging of noisy user-generated text

The performance of a Part-of-speech (POS) tagger is highly dependent on ...
research
08/26/2021

SAUCE: Truncated Sparse Document Signature Bit-Vectors for Fast Web-Scale Corpus Expansion

Recent advances in text representation have shown that training on large...
