Histogram Transform-based Speaker Identification

by Zhanyu Ma, et al.

A novel text-independent speaker identification (SI) method is proposed. This method uses Mel-frequency cepstral coefficients (MFCCs) and the dynamic information among adjacent frames as feature sets to capture a speaker's characteristics. To utilize the dynamic information, we design super-MFCC features by cascading three neighboring MFCC frames together. The probability density function (PDF) of these super-MFCC features is estimated by the recently proposed histogram transform (HT) method, which generates more training data by random transforms to realize the histogram PDF estimation and alleviates the commonly occurring discontinuity problem in multivariate histogram computation. Compared to conventional PDF estimation methods, such as Gaussian mixture models, the HT model shows promising improvement in SI performance.
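The two ideas in the abstract, stacking three neighboring MFCC frames into one feature vector and smoothing a multivariate histogram by averaging over random transforms, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names (`stack_super_frames`, `fit_random_shift_histograms`, `density`), the bin width, and the number of transforms are hypothetical, and the random transform is simplified to a random shift of the bin grid. It is not the authors' HT implementation, only a stand-in showing why averaging several randomly transformed histograms softens the bin-edge discontinuities of a single histogram.

```python
import numpy as np


def stack_super_frames(mfcc, context=1):
    """Cascade each frame with its neighbors.

    mfcc: array of shape (T, D), one MFCC vector per frame.
    Returns an array of shape (T - 2*context, (2*context + 1) * D);
    with context=1 each row is [frame t-1, frame t, frame t+1].
    """
    mfcc = np.asarray(mfcc, dtype=float)
    n_frames, _ = mfcc.shape
    width = 2 * context + 1
    if n_frames < width:
        raise ValueError("need at least 2*context + 1 frames")
    parts = [mfcc[i:n_frames - width + 1 + i] for i in range(width)]
    return np.hstack(parts)


def fit_random_shift_histograms(X, bin_width=1.0, n_transforms=20, seed=0):
    """Fit several histograms, each on a randomly shifted bin grid.

    Assumption: a random shift stands in for the random transforms
    mentioned in the abstract. Bin counts are kept in a dict so that
    only occupied bins of the high-dimensional grid are stored.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    models = []
    for _ in range(n_transforms):
        shift = rng.uniform(0.0, bin_width, size=X.shape[1])
        idx = np.floor((X + shift) / bin_width).astype(np.int64)
        counts = {}
        for key in map(tuple, idx):
            counts[key] = counts.get(key, 0) + 1
        models.append((shift, counts))
    return models, len(X), bin_width


def density(x, fitted):
    """Average the histogram density estimates over all shifted grids."""
    models, n_samples, bin_width = fitted
    x = np.asarray(x, dtype=float)
    volume = bin_width ** x.shape[0]
    total = 0.0
    for shift, counts in models:
        key = tuple(np.floor((x + shift) / bin_width).astype(np.int64))
        total += counts.get(key, 0) / (n_samples * volume)
    return total / len(models)


# Toy usage: 5 frames of 2-dimensional "MFCCs" (made-up numbers).
mfcc = np.arange(10.0).reshape(5, 2)
super_frames = stack_super_frames(mfcc)
fitted = fit_random_shift_histograms(super_frames, bin_width=2.0)
```

In an SI setting, one such density model would be fitted per enrolled speaker, and a test utterance would be assigned to the speaker whose model gives the highest total log-density over its super-frames. That decision rule is the usual one for generative speaker models and is assumed here rather than taken from the abstract.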


Pitch-synchronous DCT features: A pilot study on speaker identification

We propose a new feature, namely, pitch-synchronous discrete cosine trans...

Histogram Meets Topic Model: Density Estimation by Mixture of Histograms

The histogram method is a powerful non-parametric approach for estimatin...

A Novel Minimum Divergence Approach to Robust Speaker Identification

In this work, a novel solution to the speaker identification problem is ...

Experiments on Open-Set Speaker Identification with Discriminatively Trained Neural Networks

This paper presents a study on discriminative artificial neural network ...

Text Independent Speaker Identification System for Access Control

Even human intelligence system fails to offer 100 speeches from a specif...

Blind Extraction of Target Speech Source Guided by Supervised Speaker Identification via X-vectors

This manuscript proposes a novel robust procedure for extraction of a sp...
