Sentence, Phrase, and Triple Annotations to Build a Knowledge Graph of Natural Language Processing Contributions – A Trial Dataset

by Jennifer D'Souza, et al.

Purpose: The aim of this work is to normalize the NLPCONTRIBUTIONS scheme (henceforward, NLPCONTRIBUTIONGRAPH) to structure, directly from article sentences, the contributions information in Natural Language Processing (NLP) scholarly articles via a two-stage annotation methodology: 1) pilot stage, to define the scheme (described in prior work); and 2) adjudication stage, to normalize the graphing model (the focus of this paper).

Design/methodology/approach: We re-annotate the contributions-pertinent information across 50 previously annotated NLP scholarly articles in terms of a data pipeline comprising contribution-centered sentences, phrases, and triple statements. Specifically, care was taken in the adjudication annotation stage to reduce annotation noise while formulating the guidelines for our proposed novel NLP contributions structuring and graphing scheme.

Findings: The application of NLPCONTRIBUTIONGRAPH to the 50 articles resulted in a dataset of 900 contribution-focused sentences, 4,702 contribution-information-centered phrases, and 2,980 surface-structured triples. The intra-annotation agreement between the first and second stages, in terms of F1, was 67.92 triple statements indicating that with increased granularity of the information, the annotation decision variance is greater.

Practical Implications: We demonstrate NLPCONTRIBUTIONGRAPH data integrated into the Open Research Knowledge Graph (ORKG), a next-generation KG-based digital library with intelligent computations enabled over structured scholarly knowledge, as a viable aid to assist researchers in their day-to-day tasks.
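
To make the F1-based agreement between the two annotation stages concrete, the following is a minimal sketch that scores one stage's annotations against the other's. It assumes exact-match comparison over sets and treats the second (adjudication) stage as the reference. Both are assumptions for illustration, since the abstract does not state the matching criterion. The function name `agreement_f1` and the toy triples are hypothetical and not taken from the dataset.

```python
from typing import Hashable, Iterable


def agreement_f1(first_stage: Iterable[Hashable],
                 second_stage: Iterable[Hashable]) -> float:
    """Exact-match F1 of first-stage annotations against second-stage ones.

    Assumption: the second stage is treated as the reference, and two
    annotations agree only if they are identical.
    """
    predicted = set(first_stage)
    reference = set(second_stage)
    overlap = len(predicted & reference)
    if overlap == 0:
        # Covers empty inputs and fully disjoint annotation sets.
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy triples (subject, predicate, object), for illustration only.
pilot = {
    ("model", "uses", "attention"),
    ("model", "trained on", "corpus A"),
}
adjudicated = {
    ("model", "uses", "attention"),
    ("model", "evaluated on", "corpus B"),
    ("baseline", "is", "LSTM"),
}

score = agreement_f1(pilot, adjudicated)
print(f"{score:.2f}")
```

The same function can be applied to sentences or phrases by passing sets of strings instead of tuples. Under exact matching, finer-grained units such as triples have more ways to differ, which is consistent with agreement dropping as granularity increases.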


NLPContributions: An Annotation Scheme for Machine Reading of Scholarly Contributions in Natural Language Processing Literature

We describe an annotation initiative to capture the scholarly contributi...

SemEval-2021 Task 11: NLPContributionGraph – Structuring Scholarly NLP Contributions for a Research Knowledge Graph

There is currently a gap between the natural language expression of scho...

KnowGraph@IITK at SemEval-2021 Task 11: Building KnowledgeGraph for NLP Research

Research in Natural Language Processing is making rapid advances, result...

TinyGenius: Intertwining Natural Language Processing with Microtask Crowdsourcing for Scholarly Knowledge Graph Creation

As the number of published scholarly articles grows steadily each year, ...

Knowledge Graph Extraction from Videos

Nearly all existing techniques for automated video annotation (or captio...

Research on multi-dimensional end-to-end phrase recognition algorithm based on background knowledge

At present, the deep end-to-end method based on supervised learning is u...

UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

We propose a cascade of neural models that performs sentence classificat...
