Multiple Word Embeddings for Increased Diversity of Representation

09/30/2020
by Brian Lester, et al.

Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand. The improvements afforded by these "contextual embeddings" come with a high computational cost. In this work, we explore a simple technique that substantially and consistently improves performance over a strong baseline with negligible increase in run time. We concatenate multiple pre-trained embeddings to strengthen our representation of words. We show that this concatenation technique works across many tasks, datasets, and model types. We analyze aspects of pre-trained embedding similarity and vocabulary coverage and find that the representational diversity between different pre-trained embeddings is the driving force of why this technique works. We provide open source implementations of our models in both TensorFlow and PyTorch.
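
The concatenation itself is simple to implement. Below is a minimal PyTorch sketch of the idea, assuming a set of pre-trained embedding matrices that have already been aligned to a shared vocabulary; the class name and constructor arguments are illustrative, not the authors' released API.

```python
import torch
import torch.nn as nn

class ConcatEmbeddings(nn.Module):
    """Concatenate several pre-trained word embeddings along the feature axis.

    `pretrained` is a list of (vocab_size, dim) float tensors, one per
    pre-trained embedding (e.g., GloVe and fastText stand-ins below),
    all indexed by the same vocabulary. Illustrative sketch only.
    """

    def __init__(self, pretrained, freeze=True):
        super().__init__()
        self.embeddings = nn.ModuleList(
            nn.Embedding.from_pretrained(weights, freeze=freeze)
            for weights in pretrained
        )
        # The output width is the sum of the individual embedding sizes.
        self.output_dim = sum(e.embedding_dim for e in self.embeddings)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> (batch, seq_len, output_dim)
        return torch.cat([emb(token_ids) for emb in self.embeddings], dim=-1)


if __name__ == "__main__":
    vocab_size = 10_000
    glove = torch.randn(vocab_size, 300)     # stand-ins for real pre-trained matrices
    fasttext = torch.randn(vocab_size, 300)
    layer = ConcatEmbeddings([glove, fasttext])
    tokens = torch.randint(0, vocab_size, (2, 5))
    print(layer(tokens).shape)  # torch.Size([2, 5, 600])
```

Because each embedding lookup is just a table index, adding a second or third pre-trained table costs almost nothing at run time; the downstream model simply sees a wider input vector (here 600 dimensions instead of 300).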

