Joint Chinese Word Segmentation and Span-based Constituency Parsing

11/03/2022
by Zhicheng Wang, et al.

Span-based decoding is an important direction in constituency parsing. For Chinese sentences, however, linguistic characteristics make it necessary to first run a separate word-segmentation model, which introduces uncertainty and typically propagates errors into the subsequent construction of the constituency tree. This work proposes a method for joint Chinese word segmentation and span-based constituency parsing that adds extra labels to individual Chinese characters on the parse trees. Experiments show that the proposed algorithm outperforms recent models for joint segmentation and constituency parsing on CTB 5.1.
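To illustrate the core idea, here is a minimal sketch of how word segmentation can be read off a character-level constituency tree once word-internal structure carries an extra label. The label name "W" and the tree encoding are assumptions for illustration, not the paper's actual label set.

```python
# Hypothetical sketch: recovering word segmentation from a character-level
# constituency tree in which each multi-character word is wrapped in an
# extra node labeled "W" (the label name is an assumption, not from the paper).

from typing import List

def leaves(tree) -> List[str]:
    """Return the characters at the leaves of a subtree, left to right."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    return [c for child in children for c in leaves(child)]

def segment(tree) -> List[str]:
    """Collect words by flattening every subtree whose label is 'W'."""
    if isinstance(tree, str):           # a bare character is a one-character word
        return [tree]
    label, children = tree
    if label == "W":                    # word node: concatenate its characters
        return ["".join(leaves(tree))]
    words = []
    for child in children:
        words.extend(segment(child))
    return words

# Character-level tree for "我喜欢北京" ("I like Beijing"):
tree = ("S", [("NP", ["我"]),
              ("VP", [("W", ["喜", "欢"]),
                      ("NP", [("W", ["北", "京"])])])])
print(segment(tree))   # -> ['我', '喜欢', '北京']
```

Because segmentation and parsing are decided by the same tree, a single span-based decoder can produce both jointly, avoiding pipeline error propagation.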


Related research:

- A Unified Model for Joint Chinese Word Segmentation and Dependency Parsing (04/09/2019): Chinese word segmentation and dependency parsing are two fundamental tas...
- A Character-level Span-based Model for Mandarin Prosodic Structure Prediction (03/31/2022): The accuracy of prosodic structure prediction is crucial to the naturaln...
- Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-stage Span Labeling (12/17/2021): Chinese word segmentation and part-of-speech tagging are necessary tasks...
- Fast Neural Chinese Word Segmentation for Long Sentences (11/06/2018): Rapidly developed neural models have achieved competitive performance in...
- Order-sensitive Neural Constituency Parsing (11/01/2022): We propose a novel algorithm that improves on the previous neural span-b...
- Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference (04/22/2019): We present a constituency parsing algorithm that maps from word-aligned ...
- Dependency Parsing as MRC-based Span-Span Prediction (05/17/2021): Higher-order methods for dependency parsing can partially but not fully ...
