3D-EX: A Unified Dataset of Definitions and Dictionary Examples

08/06/2023
by Fatemah Almeman, et al.

Definitions are a fundamental building block in lexicography, linguistics and computational semantics. In NLP, they have been used for retrofitting word embeddings or augmenting contextual representations in language models. However, lexical resources containing definitions exhibit a wide range of properties, which has implications for the behaviour of models trained and evaluated on them. In this paper, we introduce 3D-EX, a dataset that aims to fill this gap by combining well-known English resources into one centralized knowledge repository in the form of <term, definition, example> triples. 3D-EX is a unified evaluation framework with carefully pre-computed train/validation/test splits to prevent memorization. We report experimental results that suggest that this dataset could be effectively leveraged in downstream NLP tasks. Code and data are available at https://github.com/F-Almeman/3D-EX.
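One plausible reading of "splits to prevent memorization" is a split in which every triple of a given term lands in exactly one partition, so a model cannot be tested on a term it saw during training. The abstract does not say how the splits are actually built, so the sketch below is only an illustration under that assumption. The `Triple` type, the `split_by_term` function, the 60/20/20 ratios and the toy triples are all hypothetical and are not taken from the 3D-EX repository.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """A <term, definition, example> record (hypothetical type)."""
    term: str
    definition: str
    example: str


def split_by_term(triples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Assign all triples of a term to one split (assumed strategy)."""
    # Group triples by lower-cased term so case variants stay together.
    by_term = defaultdict(list)
    for triple in triples:
        by_term[triple.term.lower()].append(triple)

    # Shuffle the terms (not the triples) with a fixed seed.
    terms = sorted(by_term)
    random.Random(seed).shuffle(terms)

    n_train = int(len(terms) * ratios[0])
    n_val = int(len(terms) * ratios[1])
    buckets = {
        "train": terms[:n_train],
        "validation": terms[n_train:n_train + n_val],
        "test": terms[n_train + n_val:],
    }
    # Expand each bucket of terms back into its triples.
    return {
        name: [t for term in bucket for t in by_term[term]]
        for name, bucket in buckets.items()
    }


# Toy data for illustration only; not drawn from 3D-EX.
toy = [
    Triple(f"term{i}", f"definition {i}.{j}", f"example {i}.{j}")
    for i in range(10)
    for j in range(2)
]
splits = split_by_term(toy)
```

Splitting at the term level gives up exact control over the number of triples per partition, since terms have different numbers of definitions and examples, in exchange for a guarantee that no term leaks across partitions.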

