Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation

07/06/2020
by Pyry Pyykkönen, et al.

Recent approaches for music source separation are almost exclusively based on deep neural networks, mostly employing recurrent neural networks (RNNs). Although RNNs are in many cases superior to other types of deep neural networks for sequence processing, they are known to be difficult to train and to parallelize, especially for the long sequences typically encountered in music source separation. In this paper we present a use case of replacing RNNs with depth-wise separable (DWS) convolutions, which are a lighter and faster variant of standard convolutions. We focus on singing voice separation, take an RNN-based architecture, and replace its RNNs with DWS convolutions (DWS-CNNs). We conduct an ablation study and examine the effect of the number of channels and layers of the DWS-CNNs on source separation performance, using the standard metrics of signal-to-artifacts, signal-to-interference, and signal-to-distortion ratio. Our results show that replacing RNNs with DWS-CNNs yields an improvement of 1.20, 0.06, and 0.37 dB, respectively, while using only 20.57% of the parameters of the RNN architecture.
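To illustrate why DWS convolutions are considered lightweight, the following minimal sketch compares the weight count of a standard 2D convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The function names, channel counts, and kernel size are hypothetical and chosen only for illustration; they are not the configuration used in the paper, and the sketch does not reproduce the parameter ratio reported in the abstract, which concerns the whole architecture.

```python
def standard_conv_params(c_in, c_out, k_h, k_w):
    """Weights in a standard 2D convolution (bias terms ignored).

    Every output channel has its own k_h x k_w kernel for every input channel.
    """
    return c_in * c_out * k_h * k_w


def dws_conv_params(c_in, c_out, k_h, k_w):
    """Weights in a depthwise separable 2D convolution (bias terms ignored).

    Depthwise step: one k_h x k_w kernel per input channel.
    Pointwise step: a 1x1 convolution that mixes c_in channels into c_out.
    """
    depthwise = c_in * k_h * k_w
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative values only (assumed, not taken from the paper).
channels_in, channels_out, kernel = 64, 64, 5

standard = standard_conv_params(channels_in, channels_out, kernel, kernel)
separable = dws_conv_params(channels_in, channels_out, kernel, kernel)
ratio = separable / standard

print(f"standard: {standard} weights")
print(f"depthwise separable: {separable} weights")
print(f"ratio: {ratio:.4f}")
```

With these assumed values the standard layer has 102,400 weights and the separable layer 5,696. The saving grows with the kernel size and the number of output channels, because the spatial kernel is no longer replicated for every input-output channel pair.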

research
12/06/2020

Source Separation and Depthwise Separable Convolutions for Computer Audition

Given recent advances in deep music source separation, we propose a feat...
research
02/02/2020

Sound Event Detection with Depthwise Separable and Dilated Convolutions

State-of-the-art sound event detection (SED) methods usually employ a se...
research
09/15/2023

Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)

Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' ...
research
10/25/2020

Attention is All You Need in Speech Separation

Recurrent Neural Networks (RNNs) have long been the dominant architectur...
research
08/23/2019

Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation

This paper proposes a Bitwise Gated Recurrent Unit (BGRU) network for th...
research
09/05/2023

Music Source Separation with Band-Split RoPE Transformer

Music source separation (MSS) aims to separate a music recording into mu...
research
06/04/2019

Dilated Convolution with Dilated GRU for Music Source Separation

Stacked dilated convolutions used in Wavenet have been shown effective f...
