Learning to Embed Words in Context for Syntactic Tasks

06/09/2017
by Lifu Tu, et al.

We present models for embedding words in the context of surrounding words. Such models, which we refer to as token embeddings, represent the characteristics of a word that are specific to a given context, such as word sense, syntactic category, and semantic role. We explore simple, efficient token embedding models based on standard neural network architectures. We learn token embeddings on a large amount of unannotated text and evaluate them as features for part-of-speech taggers and dependency parsers trained on much smaller amounts of annotated data. We find that predictors endowed with token embeddings consistently outperform baseline predictors across a range of context window and training set sizes.
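
The abstract does not spell out which architectures are used, so the following is only a minimal sketch of a token embedding model, assuming a bidirectional LSTM run over a fixed window of surrounding words as one plausible "standard neural network architecture". The class name TokenEmbedder, the dimensions, and the toy usage are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class TokenEmbedder(nn.Module):
    """Context-sensitive token embedder (sketch): a bidirectional LSTM reads
    a window of surrounding words and returns a vector for the center token."""

    def __init__(self, vocab_size, word_dim=100, hidden_dim=100):
        super().__init__()
        # Ordinary type-level word embeddings feed the context encoder.
        self.word_embeddings = nn.Embedding(vocab_size, word_dim)
        self.encoder = nn.LSTM(word_dim, hidden_dim,
                               batch_first=True, bidirectional=True)

    def forward(self, window_ids, center_index):
        # window_ids: (batch, window_len) word indices for a context window.
        embedded = self.word_embeddings(window_ids)  # (batch, window_len, word_dim)
        outputs, _ = self.encoder(embedded)          # (batch, window_len, 2 * hidden_dim)
        # The token embedding is the encoder state at the center position,
        # which mixes information from the words on both sides.
        return outputs[:, center_index, :]           # (batch, 2 * hidden_dim)


# Toy usage: embed the middle word of a batch of 5-word context windows.
model = TokenEmbedder(vocab_size=10_000)
windows = torch.randint(0, 10_000, (2, 5))   # 2 windows, 5 tokens each
token_vectors = model(windows, center_index=2)
print(token_vectors.shape)                   # torch.Size([2, 200])
```

In the setting described above, vectors like these would be learned from large amounts of unannotated text and then supplied as extra input features to a part-of-speech tagger or dependency parser trained on a much smaller annotated corpus.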
