Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective

11/22/2018
by Zhong-Qiu Wang, et al.

This study investigates phase reconstruction for deep learning based monaural talker-independent speaker separation in the short-time Fourier transform (STFT) domain. The key observation is that, for a mixture of two sources whose magnitudes are accurately estimated, and under a geometric constraint, the absolute phase difference between each source and the mixture can be uniquely determined. As a result, the source phases at each time-frequency (T-F) unit can be narrowed down to only two candidates. To pick the right candidate, we propose three algorithms based on iterative phase reconstruction, group delay estimation, and phase-difference sign prediction. State-of-the-art results are obtained on the publicly available wsj0-2mix and wsj0-3mix corpora.
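The two-candidate observation can be illustrated with a small numerical sketch. It is not the authors' implementation; it assumes the geometric constraint is the triangle formed by the mixture and the two sources in the complex plane, so that the law of cosines gives each absolute phase difference. The function names, the `eps` stabilizer, and the clipping step are illustrative assumptions.

```python
import numpy as np

def abs_phase_differences(mix_mag, mag_a, mag_b, eps=1e-8):
    """Absolute phase difference between each source and the mixture.

    Assumes mixture = source_a + source_b at every T-F unit, so the three
    magnitudes form a triangle and the law of cosines applies.
    """
    cos_a = (mix_mag**2 + mag_a**2 - mag_b**2) / (2 * mix_mag * mag_a + eps)
    cos_b = (mix_mag**2 + mag_b**2 - mag_a**2) / (2 * mix_mag * mag_b + eps)
    # Estimated magnitudes may violate the triangle inequality; clipping
    # keeps arccos defined (an assumption of this sketch).
    theta_a = np.arccos(np.clip(cos_a, -1.0, 1.0))
    theta_b = np.arccos(np.clip(cos_b, -1.0, 1.0))
    return theta_a, theta_b

def phase_candidates(mix_phase, theta_a, theta_b):
    """Return the two candidate (phase_a, phase_b) pairs.

    The sources lie on opposite sides of the mixture in the complex plane,
    so the signs of the two phase differences are opposite.
    """
    first = (mix_phase + theta_a, mix_phase - theta_b)
    second = (mix_phase - theta_a, mix_phase + theta_b)
    return first, second

# Usage: one T-F unit with known ground truth.
a = 1.0 * np.exp(1j * 0.3)
b = 0.5 * np.exp(-1j * 1.1)
y = a + b
theta_a, theta_b = abs_phase_differences(np.abs(y), np.abs(a), np.abs(b))
candidates = phase_candidates(np.angle(y), theta_a, theta_b)
```

Both candidate pairs sum back to the mixture, which is why the magnitudes alone cannot resolve the sign; choosing between them is the job of the three algorithms named in the abstract.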

research
04/26/2018

End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction

This paper proposes an end-to-end approach for single-channel speaker-in...
research
07/30/2018

Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

Harmonic/percussive source separation (HPSS) consists in separating the ...
research
10/02/2018

Phasebook and Friends: Leveraging Discrete Representations for Source Separation

Deep learning based speech enhancement and source separation systems hav...
research
07/12/2017

Speaker-independent Speech Separation with Deep Attractor Network

Despite the recent success of deep learning for many speech processing t...
research
11/12/2022

Online Phase Reconstruction via DNN-based Phase Differences Estimation

This paper presents a two-stage online phase reconstruction framework us...
research
05/11/2022

Beyond Griffin-Lim: Improved Iterative Phase Retrieval for Speech

Phase retrieval is a problem encountered not only in speech and audio pr...
research
12/19/2019

Practical applicability of deep neural networks for overlapping speaker separation

This paper examines the applicability in realistic scenarios of two deep...
