TutorialBank: A Manually-Collected Corpus for Prerequisite Chains, Survey Extraction and Resource Recommendation

05/11/2018
by Alexander R. Fabbri, et al.

The field of Natural Language Processing (NLP) is growing rapidly, with new research published daily along with an abundance of tutorials, codebases and other online resources. To learn this dynamic field or stay up to date on the latest research, students, educators and researchers must constantly sift through multiple sources to find valuable, relevant information. To address this situation, we introduce TutorialBank, a new, publicly available dataset which aims to facilitate NLP education and research. We have manually collected and categorized over 6,300 resources on NLP as well as the related fields of Artificial Intelligence (AI), Machine Learning (ML) and Information Retrieval (IR). Notably, our dataset is the largest manually-picked corpus of resources intended for NLP education that is not limited to academic papers. Additionally, we have created both a search engine and a command-line tool for the resources, and have annotated the corpus with lists of research topics, relevant resources for each topic, prerequisite relations among topics, and relevant sub-parts of individual resources, among other annotations. We are releasing the dataset and present several avenues for further research.

research
11/26/2018

What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning

Recent years have witnessed the rising popularity of Natural Language Pr...
research
12/01/2022

SOLD: Sinhala Offensive Language Dataset

The widespread of offensive content online, such as hate speech and cybe...
research
07/31/2023

NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers?

The rapid growth of information in the field of Generative Artificial In...
research
01/25/2018

Etymo: A New Discovery Engine for AI Research

We present Etymo (https://etymo.io), a discovery engine to facilitate ar...
research
12/16/2021

CLICKER: A Computational LInguistics Classification Scheme for Educational Resources

A classification scheme of a scientific subject gives an overview of its...
research
06/20/2018

TxPI-u: A Resource for Personality Identification of Undergraduates

Resources such as labeled corpora are necessary to train automatic model...
research
02/06/2020

Intelligent Arxiv: Sort daily papers by learning users topics preference

Current daily paper releases are becoming increasingly large and areas o...
