Sound texture synthesis using convolutional neural networks

05/09/2019
by Hugo Caracalla, et al.

This article introduces a new parametric synthesis algorithm for sound textures, inspired by existing methods for visual textures. Using a 2D Convolutional Neural Network (CNN), a sound signal is modified until the temporal cross-correlations of the feature maps of its log-spectrogram resemble those of a target texture. We show that the resulting synthesized signal is both different from the original and of high quality, while still reproducing singular events that appear in the original. The process is performed in the time domain, which removes the harmful phase recovery step that usually concludes synthesis performed in the time-frequency domain. It is also straightforward and flexible, as it requires no fine-tuning between several losses when synthesizing diverse sound textures. We also present a way of extending the synthesis to produce a sound of any length, and then showcase synthesized spectrograms and sound signals. Finally, we discuss the choice of CNN, border effects in the synthesized signals, and possible modifications of the algorithm to reduce its currently long computation time.
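As a rough illustration of the statistic described above, the following Python sketch computes time-averaged cross-correlations between feature-map channels and a squared-error loss between a candidate and a target. It is a minimal sketch under stated assumptions, not the authors' implementation: the array layout (channels, frequency bins, frames), the per-frequency averaging, and the function names `temporal_cross_correlations` and `texture_loss` are all hypothetical, and the CNN and log-spectrogram steps are assumed to have produced the feature arrays already.

```python
import numpy as np


def temporal_cross_correlations(features):
    """Time-averaged cross-correlations between channels (assumed layout).

    features: array of shape (channels, freq_bins, frames), standing in for
    the CNN feature maps of a log-spectrogram.
    Returns an array of shape (freq_bins, channels, channels).
    """
    frames = features.shape[2]
    # For each frequency bin, correlate every pair of channels over time.
    return np.einsum("aft,bft->fab", features, features) / frames


def texture_loss(candidate_features, target_features):
    """Sum of squared differences between the two sets of statistics."""
    diff = (temporal_cross_correlations(candidate_features)
            - temporal_cross_correlations(target_features))
    return float(np.sum(diff ** 2))
```

Because the statistics in this sketch are averaged over time, the candidate and the target may have different numbers of frames. In a full synthesis loop, a loss of this kind would be minimized with respect to the time-domain signal by gradient descent, which requires an automatic differentiation framework rather than plain NumPy.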


research
10/21/2019

Sound texture synthesis using RI spectrograms

This article introduces a new parametric synthesis method for sound text...
research
02/11/2021

Onoma-to-wave: Environmental sound synthesis from onomatopoeic words

In this paper, we propose a new framework for environmental sound synthe...
research
02/22/2017

Synthesising Dynamic Textures using Convolutional Neural Networks

Here we present a parametric model for dynamic textures. The model is ba...
research
06/20/2018

Synthesizing Diverse, High-Quality Audio Textures

Texture synthesis techniques based on matching the Gram matrix of featur...
research
02/21/2020

AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning

In movie productions, the Foley Artist is responsible for creating an ov...
research
01/07/2022

A sinusoidal signal reconstruction method for the inversion of the mel-spectrogram

The synthesis of sound via deep learning methods has recently received m...
research
12/17/2018

Quaternion Convolutional Neural Networks for Detection and Localization of 3D Sound Events

Learning from data in the quaternion domain enables us to exploit intern...
