Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words

10/29/2021
by Santiago Cuervo, et al.

We investigate how several self-supervised learning (SSL) methods based on Contrastive Predictive Coding (CPC) perform on phoneme categorization and on phoneme and word segmentation. Our experiments show that existing algorithms exhibit a trade-off between categorization and segmentation performance. We investigate the source of this conflict and conclude that context-building networks, although necessary for superior performance on categorization tasks, harm segmentation performance by causing a temporal shift in the learned representations. To bridge this gap, we take inspiration from the leading segmentation approach, which simultaneously models the speech signal at the frame and phoneme levels, and incorporate multi-level modelling into Aligned CPC (ACPC), a variation of CPC that exhibits the best performance on categorization tasks. Our multi-level ACPC (mACPC) improves on all categorization metrics and achieves state-of-the-art performance in word segmentation.
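
For readers unfamiliar with CPC, the contrastive objective it is generally built on (InfoNCE) can be sketched in a few lines. This is only an illustration of the generic loss, not the ACPC or mACPC training code from the paper; the function name `info_nce_loss`, the dot-product scoring, and the use of the other items in the batch as negatives are all assumptions made for the example.

```python
import math

def info_nce_loss(predictions, targets):
    """Generic InfoNCE loss (illustrative sketch, not the paper's implementation).

    predictions: list of N vectors, e.g. predicted future frame representations.
    targets: list of N vectors; targets[i] is the positive for predictions[i],
             and all other targets act as negatives (an assumption of this sketch).
    """
    n = len(predictions)
    total = 0.0
    for i in range(n):
        # Dot-product score between prediction i and every candidate target.
        scores = [
            sum(p * t for p, t in zip(predictions[i], targets[j]))
            for j in range(n)
        ]
        # Numerically stable log-sum-exp over all candidates.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-probability of picking the positive among the candidates.
        total += log_norm - scores[i]
    return total / n
```

With uninformative (all-zero) representations the loss equals log N, the chance level for picking the positive among N candidates; it approaches zero as each prediction scores its own target far above the negatives.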
