Multi-channel Speech Separation Using Deep Embedding Model with Multilayer Bootstrap Networks

10/24/2019
by Ziye Yang, et al.

Recently, deep clustering (DPCL) based speaker-independent speech separation has drawn much attention, since it needs little prior information about the speakers. However, it still has much room for improvement, particularly in reverberant environments. If the training and test environments are mismatched, which is a common case, the embedding vectors produced by DPCL may contain much noise and many small variations. To deal with this problem, we propose a variant of DPCL, named DPCL++, which applies a recent unsupervised deep learning method, multilayer bootstrap networks (MBN), to further reduce the noise and small variations of the embedding vectors in an unsupervised way at the test stage, which helps k-means produce a good result. MBN builds a gradually narrowed network from the bottom up via a stack of k-centroids clustering ensembles, where the k-centroids clusterings are trained independently by random sampling and one-nearest-neighbor optimization. To further improve the robustness of DPCL++ in reverberant environments, we take spatial features as part of its input. Experimental results demonstrate the effectiveness of the proposed method.
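
The MBN construction described in the abstract can be illustrated with a minimal NumPy sketch. It is an assumption-laden illustration and not the authors' implementation: the function names `mbn_layer` and `mbn_transform`, the layer sizes in `ks`, the number of clusterings per layer, and the use of squared Euclidean distance are all hypothetical choices, and any further details from the full paper are omitted. Each layer draws several independent sets of k centroids by random sampling from its input, encodes every point as a one-hot indicator of its nearest centroid, and concatenates the indicators; k shrinks from layer to layer, which gives the gradually narrowed network.

```python
import numpy as np


def mbn_layer(X, k, n_clusterings, rng):
    """One layer: an ensemble of k-centroids clusterings (illustrative sketch).

    Each clustering picks k rows of X at random as centroids and encodes every
    row of X as a one-hot vector marking its nearest centroid.
    """
    n = X.shape[0]
    x_sq = (X ** 2).sum(axis=1)[:, None]
    outputs = []
    for _ in range(n_clusterings):
        # Random sampling: k distinct input points serve as centroids.
        idx = rng.choice(n, size=k, replace=False)
        centroids = X[idx]
        # Squared Euclidean distance from every point to every centroid.
        c_sq = (centroids ** 2).sum(axis=1)[None, :]
        d2 = x_sq - 2.0 * (X @ centroids.T) + c_sq
        # One-nearest-neighbor assignment, stored as a one-hot code.
        nearest = d2.argmin(axis=1)
        one_hot = np.zeros((n, k))
        one_hot[np.arange(n), nearest] = 1.0
        outputs.append(one_hot)
    # Concatenate the codes of all clusterings in this layer.
    return np.hstack(outputs)


def mbn_transform(X, ks=(32, 16, 8), n_clusterings=20, seed=0):
    """Stack layers with decreasing k (assumed values) from the bottom up."""
    rng = np.random.default_rng(seed)
    H = np.asarray(X, dtype=float)
    for k in ks:
        H = mbn_layer(H, k, n_clusterings, rng)
    return H
```

In a DPCL++-style pipeline, `X` would hold the embedding vectors produced by the trained DPCL network at test time, and the output of `mbn_transform` would then be passed to k-means; that final clustering step is not shown here. The sketch requires at least as many input rows as the largest value in `ks`.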

research
02/05/2020

Spatial and spectral deep attention fusion for multi-channel speech separation using deep embedding features

Multi-channel deep clustering (MDC) has acquired a good performance for ...
research
08/05/2014

Multilayer bootstrap networks

Multilayer bootstrap network builds a gradually narrowed multilayer nonl...
research
01/13/2021

F3SNet: A Four-Step Strategy for QIM Steganalysis of Compressed Speech Based on Hierarchical Attention Network

Traditional machine learning-based steganalysis methods on compressed sp...
research
03/22/2015

Unsupervised model compression for multilayer bootstrap networks

Recently, multilayer bootstrap network (MBN) has demonstrated promising ...
research
04/06/2020

Simultaneous Denoising and Dereverberation Using Deep Embedding Features

Monaural speech dereverberation is a very challenging task because no sp...
research
10/24/2019

Deep topic modeling by multilayer bootstrap network and lasso

Topic modeling is widely studied for the dimension reduction and analysi...
research
12/19/2019

Practical applicability of deep neural networks for overlapping speaker separation

This paper examines the applicability in realistic scenarios of two deep...
