WIKITIDE: A Wikipedia-Based Timestamped Definition Pairs Dataset

08/07/2023
by Hsuvas Borkakoty, et al.

A fundamental challenge in current NLP, which is dominated by language models, is the inability of current architectures to 'learn' new information. While model-centric solutions such as continual learning or parameter-efficient fine-tuning are available, the question remains of how to reliably identify changes in language or in the world. In this paper, we propose WikiTiDe, a dataset derived from pairs of timestamped definitions extracted from Wikipedia. We argue that such a resource can help accelerate diachronic NLP, specifically by supporting the training of models able to scan knowledge resources for core updates concerning a concept, an event, or a named entity. Our proposed end-to-end method is fully automatic and leverages a bootstrapping algorithm to gradually create a high-quality dataset. Our results suggest that bootstrapping the seed version of WikiTiDe leads to better fine-tuned models. We also apply the fine-tuned models to a number of downstream tasks, showing promising results with respect to competitive baselines.
