Computational Induction of Prosodic Structure

12/15/2019
by Dafydd Gibbon, et al.

The present study has two goals relating to the grammar of prosody, understood as the rhythms and melodies of speech. First, an overview is provided of the computable grammatical and phonetic approaches to prosody analysis which use hypothetico-deductive methods and are based on learned hermeneutic intuitions about language. Second, a proposal is presented for an inductive grounding in the physical signal, in which prosodic structure is inferred from the low-frequency spectrum of the speech signal using a language-independent method. The overview includes a discussion of computational aspects of standard generative and post-generative models, with suggestions for reformulating these as inductive approaches. Also included is a discussion of linguistic phonetic approaches to the analysis of annotations (pairs of speech unit labels with time-stamps) of recorded spoken utterances. The proposal introduces Rhythm Formant Theory (RFT) and the associated Rhythm Formant Analysis (RFA) method, with the aim of filling a gap in the linguistic hypothetico-deductive cycle by grounding it in a language-independent inductive procedure of speech signal analysis. The method is validated and applied to rhythm patterns in read-aloud Mandarin Chinese, finding differences from English which are related to lexical and grammatical differences between the languages, as well as individual variation. The overall conclusions are (1) that normative language-to-language phonological or phonetic comparisons of rhythm, for example of Mandarin and English, are too simplistic in view of diverse language-internal factors due to genre and style differences as well as utterance dynamics, and (2) that language-independent empirical grounding of rhythm in the physical signal is called for.
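
The idea of reading rhythm off the low-frequency spectrum of the signal can be illustrated with a minimal sketch. This is not the RFA procedure itself, whose details are not given in the abstract. It assumes a crude amplitude envelope obtained by rectification, a plain FFT, and a 10 Hz cutoff. The function names, the cutoff, and the synthetic test signal are all illustrative assumptions.

```python
import numpy as np


def low_frequency_spectrum(signal, sample_rate, max_hz=10.0):
    """Return (freqs, magnitudes) of the envelope spectrum in (0, max_hz].

    Assumption: the amplitude envelope is approximated by full-wave
    rectification; a real analysis might use a smoother envelope.
    """
    envelope = np.abs(signal)
    envelope = envelope - envelope.mean()  # drop the DC component
    magnitudes = np.abs(np.fft.rfft(envelope))
    freqs = np.fft.rfftfreq(len(envelope), d=1.0 / sample_rate)
    keep = (freqs > 0) & (freqs <= max_hz)
    return freqs[keep], magnitudes[keep]


def dominant_rhythm_frequency(signal, sample_rate, max_hz=10.0):
    """Frequency (Hz) of the strongest low-frequency envelope component."""
    freqs, magnitudes = low_frequency_spectrum(signal, sample_rate, max_hz)
    return float(freqs[np.argmax(magnitudes)])


# Synthetic stand-in for speech (hypothetical data): a 200 Hz carrier
# whose amplitude is modulated at 4 Hz, roughly a syllable-like rate.
sample_rate = 8000
n_samples = 4 * sample_rate  # four seconds of signal
t = np.arange(n_samples) / sample_rate
carrier = np.sin(2 * np.pi * 200 * t)
modulator = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)
signal = modulator * carrier

peak_hz = dominant_rhythm_frequency(signal, sample_rate)
print(f"dominant low-frequency component: {peak_hz:.2f} Hz")
```

On this synthetic signal the strongest component below 10 Hz should fall at the modulation rate. On real speech the envelope spectrum typically shows several broader peaks rather than a single clean line, so the single-peak readout here is only a simplification.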
