On Data Sampling Strategies for Training Neural Network Speech Separation Models

04/14/2023
by William Ravenscroft, et al.

Speech separation remains an important area of multi-speaker signal processing. Deep neural network (DNN) models have attained the best performance on many speech separation benchmarks, but some of these models take significant time to train and have high memory requirements. Previous work has proposed shortening training examples to address these issues, but the impact of doing so on model performance is not yet well understood. In this work, the impact of applying such training signal length (TSL) limits is analysed for two speech separation models: SepFormer, a transformer model, and Conv-TasNet, a convolutional model. The WSJ0-2Mix, WHAMR and Libri2Mix datasets are analysed in terms of signal length distribution and its impact on training efficiency. It is demonstrated that, for specific signal length distributions, applying specific TSL limits results in better performance. This is shown to be mainly due to randomly sampling the start index of the waveforms, which yields more unique examples for training. A SepFormer model trained using a TSL limit of 4.42s and dynamic mixing (DM) is shown to match the best-performing SepFormer model trained with DM and unlimited signal lengths. Furthermore, the 4.42s TSL limit results in a 44
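
The random start-index sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `crop_to_tsl`, its parameters, and the 8 kHz sample rate are assumptions made for illustration only. The idea is that a signal longer than the TSL limit is cropped to a fixed-length window whose start is drawn at random, so the same utterance can contribute different segments across epochs.

```python
import random


def crop_to_tsl(mixture, sources, tsl_limit_s=4.42, sample_rate=8000, rng=random):
    """Crop a mixture and its source signals to at most tsl_limit_s seconds.

    Hypothetical helper: the name, signature and default sample rate are
    assumptions, not taken from the paper. Signals are plain sequences of
    samples, and every source is assumed to be as long as the mixture.
    """
    max_len = int(round(tsl_limit_s * sample_rate))
    n = len(mixture)
    if n <= max_len:
        # Signals at or below the limit are returned unchanged.
        return mixture, sources
    # Draw the start index uniformly at random (both ends inclusive), so
    # repeated calls on the same utterance can return different segments.
    start = rng.randint(0, n - max_len)
    end = start + max_len
    # Apply the same window to every source so targets stay time-aligned.
    return mixture[start:end], [s[start:end] for s in sources]


if __name__ == "__main__":
    long_signal = list(range(40000))  # 5 s of dummy samples at 8 kHz
    cropped, cropped_sources = crop_to_tsl(long_signal, [long_signal, long_signal])
    print(len(cropped), [len(s) for s in cropped_sources])
```

Applying one shared window to the mixture and all sources keeps the training targets aligned with the input, which a per-signal random crop would not guarantee.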

