Treebank Embedding Vectors for Out-of-domain Dependency Parsing

05/02/2020
by Joachim Wagner, et al.
A recent advance in monolingual dependency parsing is the idea of a treebank embedding vector, which allows all treebanks for a particular language to be used as training data while at the same time allowing the model to prefer training data from one treebank over others and to select the preferred treebank at test time. We build on this idea by 1) introducing a method to predict a treebank vector for sentences that do not come from a treebank used in training, and 2) exploring what happens when we move away from predefined treebank embedding vectors during test time and instead devise tailored interpolations. We show that 1) there are interpolated vectors that are superior to the predefined ones, and 2) treebank vectors can be predicted with sufficient accuracy, for nine out of ten test languages, to match the performance of an oracle approach that knows the most suitable predefined treebank embedding for the test set.
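
As a rough illustration of the interpolation idea in the abstract, the sketch below builds a test-time treebank vector as a weighted combination of predefined treebank embedding vectors. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `interpolate_treebank_vectors`, the two-dimensional embeddings, and the weights are all hypothetical. The abstract does not say whether weights must sum to one, so the sketch does not enforce that.

```python
import numpy as np

def interpolate_treebank_vectors(vectors, weights):
    """Return the weighted sum of predefined treebank embedding vectors.

    vectors: array-like of shape (num_treebanks, dim)
    weights: array-like of shape (num_treebanks,)
    """
    vectors = np.asarray(vectors, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if vectors.shape[0] != weights.shape[0]:
        raise ValueError("need exactly one weight per treebank vector")
    # A one-hot weight vector selects a single predefined embedding;
    # any other weighting yields an interpolated vector.
    return weights @ vectors

# Hypothetical 2-dimensional embeddings for three treebanks of one language.
treebank_vectors = np.array([[1.0, 0.0],
                             [0.0, 1.0],
                             [1.0, 1.0]])

# Selecting the first treebank's predefined vector.
one_hot = interpolate_treebank_vectors(treebank_vectors, [1.0, 0.0, 0.0])

# An equal blend of the first two treebanks.
blended = interpolate_treebank_vectors(treebank_vectors, [0.5, 0.5, 0.0])
```

In a parser that takes a treebank vector as an extra input, `blended` would be supplied at test time in place of one of the predefined vectors.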
