Information Extraction based on Named Entity for Tourism Corpus

by Chantana Chantrapornchai, et al.

Tourism information is scattered across many sources. Finding it usually means browsing through search engine results, then selecting and viewing the details of each accommodation, which is time consuming. In this paper, we present a methodology to extract particular information from the full text returned by the search engine, so that users can look directly at the relevant information they want. The approach can be used for the same task in other domains. The main steps are 1) building training data and 2) building a recognition model. First, the tourism data is gathered and the vocabularies are built. The raw corpus is used to train the vocabulary embedding and to create annotated data. The process of creating named entity annotation is presented. Then, the recognition model for a given entity type can be built. In the experiments, given a hotel description, the model can extract the desired entities, i.e., name, location, and facility. The extracted data can further be stored as structured information, e.g., in an ontology format, for future querying and inference. The model for automatic named entity identification, based on machine learning, yields the error ranging 8
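To make the annotation step more concrete, the following is a minimal sketch of how a vocabulary could be used to produce named entity annotations for a hotel description. It is not the authors' implementation: the BIO tagging scheme, the tag names `NAME`, `LOC`, and `FAC`, the functions `bio_annotate` and `extract_entities`, and the sample sentence and vocabulary are all assumptions made for illustration. A trained recognition model would replace the dictionary lookup shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical vocabulary: entity type -> list of lowercased token sequences.
Vocab = Dict[str, List[List[str]]]


def bio_annotate(tokens: List[str], vocab: Vocab) -> List[Tuple[str, str]]:
    """Tag tokens with B-/I-/O labels using the longest vocabulary match."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    i = 0
    while i < len(tokens):
        best_len, best_type = 0, None
        for ent_type, phrases in vocab.items():
            for phrase in phrases:
                n = len(phrase)
                # Prefer the longest phrase that matches at position i.
                if n > best_len and lowered[i:i + n] == phrase:
                    best_len, best_type = n, ent_type
        if best_type is None:
            i += 1
            continue
        tags[i] = f"B-{best_type}"
        for j in range(i + 1, i + best_len):
            tags[j] = f"I-{best_type}"
        i += best_len
    return list(zip(tokens, tags))


def extract_entities(tagged: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group B-/I- tagged tokens back into entity strings per type."""
    entities: Dict[str, List[str]] = {}
    current_type, current = None, []
    # A sentinel "O" tag flushes any entity still open at the end.
    for token, tag in tagged + [("", "O")]:
        if tag.startswith("I-") and current_type == tag[2:]:
            current.append(token)
            continue
        if current:
            entities.setdefault(current_type, []).append(" ".join(current))
        if tag.startswith("B-"):
            current_type, current = tag[2:], [token]
        else:
            current_type, current = None, []
    return entities


# Illustrative data only; not taken from the paper's corpus.
vocab: Vocab = {
    "NAME": [["grand", "palace", "hotel"]],
    "LOC": [["bangkok"]],
    "FAC": [["free", "wifi"], ["swimming", "pool"]],
}
text = "Grand Palace Hotel is located in Bangkok and offers free wifi and a swimming pool"
tagged = bio_annotate(text.split(), vocab)
entities = extract_entities(tagged)
print(entities)
```

The resulting dictionary groups the extracted strings by entity type, which is one simple structured form that could later be mapped to an ontology for querying.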




Introducing RONEC -- the Romanian Named Entity Corpus

We present RONEC - the Named Entity Corpus for the Romanian language. Th...

KazNERD: Kazakh Named Entity Recognition Dataset

We present the development of a dataset for Kazakh named entity recognit...

#MeTooMaastricht: Building a chatbot to assist survivors of sexual harassment

Inspired by the recent social movement of #MeToo, we are building a chat...

Augmented Understanding and Automated Adaptation of Curation Rules

Over the past years, there has been many efforts to curate and increase ...

Learning A Unified Named Entity Tagger From Multiple Partially Annotated Corpora For Efficient Adaptation

Named entity recognition (NER) identifies typed entity mentions in raw t...

AiCEF: An AI-assisted Cyber Exercise Content Generation Framework Using Named Entity Recognition

Content generation that is both relevant and up to date with the current...

"The Michael Jordan of Greatness": Extracting Vossian Antonomasia from Two Decades of the New York Times, 1987-2007

Vossian Antonomasia is a prolific stylistic device, in use since antiqui...
