Mapping Process for the Task: Wikidata Statements to Text as Wikipedia Sentences

10/23/2022
by Hoang Thang Ta, et al.

Wikipedia, widely regarded as one of the most successful online collaborative projects, has grown rapidly in recent years and continues to seek to expand its content and disseminate knowledge to everyone globally. A shortage of volunteers creates many problems for Wikipedia, including the difficulty of developing content for over 300 languages. Machines that automatically generate content could therefore considerably reduce human effort on Wikipedia language projects. In this paper, we propose a mapping process for the task of converting Wikidata statements to natural language text (WS2T) for Wikipedia projects at the sentence level. The main step is to organize statements, represented as groups of quadruples and triples, and then map them to corresponding sentences in English Wikipedia. We evaluate the output corpus in several respects: sentence structure analysis, noise filtering, and relationships between sentence components based on word embedding models. The results are useful not only for the data-to-text generation task but also for related work in the field.
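To make the idea of mapping statements to sentences concrete, the following is a minimal sketch and not the authors' pipeline. It assumes that a triple is (subject, property, object), that a quadruple adds one qualifier value, and that a sentence "matches" a statement when every value label appears in it verbatim. The names `Statement`, `matches`, and `map_statements` and the sample data are hypothetical and chosen only for illustration; a real system would also need entity aliases, coreference handling, and fuzzier matching.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """A Wikidata-style statement: a triple, or a quadruple if a qualifier is set."""
    subject: str
    prop: str
    obj: str
    qualifier: Optional[str] = None  # assumption: a single qualifier value

    def labels(self):
        """Return the value labels that must appear in a matching sentence."""
        parts = [self.subject, self.obj]
        if self.qualifier is not None:
            parts.append(self.qualifier)
        return parts


def matches(statement, sentence):
    """Naive criterion: every label occurs in the sentence (case-insensitive)."""
    text = sentence.lower()
    return all(label.lower() in text for label in statement.labels())


def map_statements(statements, sentences):
    """Pair each statement with every sentence that mentions all of its labels."""
    return [(st, s) for st in statements for s in sentences if matches(st, s)]


# Illustrative data only.
STATEMENTS = [
    Statement("Marie Curie", "award received", "Nobel Prize in Physics", "1903"),
    Statement("Marie Curie", "place of birth", "Warsaw"),
]
SENTENCES = [
    "Marie Curie received the Nobel Prize in Physics in 1903.",
    "Marie Curie was born in Warsaw.",
    "She conducted pioneering research on radioactivity.",
]

pairs = map_statements(STATEMENTS, SENTENCES)
for statement, sentence in pairs:
    kind = "quadruple" if statement.qualifier else "triple"
    print(f"{kind}: {statement.prop} -> {sentence}")
```

The third sentence is left unmapped because it uses a pronoun instead of the subject label, which hints at why the noise filtering and sentence structure analysis mentioned above matter for such a corpus.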


