Analyzing the Representational Geometry of Acoustic Word Embeddings

01/08/2023
by Badr M. Abdullah, et al.
Acoustic word embeddings (AWEs) are vector representations such that different acoustic exemplars of the same word are projected nearby in the embedding space. In addition to their use in speech technology applications such as spoken term discovery and keyword spotting, AWE models have been adopted as models of spoken-word processing in several cognitively motivated studies and have been shown to exhibit human-like performance in some auditory processing tasks. Nevertheless, the representational geometry of AWEs remains under-explored in the literature. In this paper, we take a closer analytical look at AWEs learned from English speech and study how the choice of learning objective and architecture shapes their representational profile. To this end, we employ a set of analytic techniques from machine learning and neuroscience in three different analyses: embedding space uniformity, word discriminability, and representational consistency. Our main findings highlight that the learning objective plays a more prominent role than the model architecture in shaping the representational profile.

research
09/14/2022

Integrating Form and Meaning: A Multi-Task Learning Model for Acoustic Word Embeddings

Models of acoustic word embeddings (AWEs) learn to map variable-length s...
research
06/16/2021

Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study

Several variants of deep neural networks have been successfully employed...
research
02/18/2019

Learned In Speech Recognition: Contextual Acoustic Word Embeddings

End-to-end acoustic-to-word speech recognition models have recently gain...
research
08/28/2023

Neural approaches to spoken content embedding

Comparing spoken segments is a central operation to speech processing. T...
research
10/21/2022

Spoken Term Detection and Relevance Score Estimation using Dot-Product of Pronunciation Embeddings

The paper describes a novel approach to Spoken Term Detection (STD) in l...
research
03/07/2021

CNN-based Spoken Term Detection and Localization without Dynamic Programming

In this paper, we propose a spoken term detection algorithm for simultan...
research
09/21/2021

How Familiar Does That Sound? Cross-Lingual Representational Similarity Analysis of Acoustic Word Embeddings

How do neural networks "perceive" speech sounds from unknown languages? ...
