IR-GAN: Room Impulse Response Generator for Speech Augmentation
We present a Generative Adversarial Network (GAN) based room impulse response generator that produces realistic synthetic room impulse responses. The generator creates synthetic room impulse responses by parametrically controlling the acoustic features captured in real-world room impulse responses. Our GAN-based room impulse response generator (IR-GAN) improves far-field automatic speech recognition in environments not seen during training. We create a far-field speech training set by augmenting clean speech from the LibriSpeech dataset with our synthesized room impulse responses. We evaluate the quality of our room impulse responses on a real-world LibriSpeech test set created using real impulse responses from the BUT ReverbDB and AIR datasets. Furthermore, we combine our synthetic data with synthetic impulse responses generated using acoustic simulators, and this combination can reduce the word error rate by up to 14.3
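As an illustration of the augmentation step described in the abstract, the sketch below shows the usual way a room impulse response is applied to clean speech: by convolution. It is a minimal sketch under stated assumptions, not the authors' pipeline. The names `toy_rir` and `reverberate` are hypothetical, the exponentially decaying noise is only a stand-in for an impulse response (it is not IR-GAN output), and the 16 kHz sample rate and peak rescaling are choices made for this example.

```python
import numpy as np


def toy_rir(rt60=0.3, sr=16000, seed=0):
    """Hypothetical stand-in RIR: white noise with an exponential decay.

    The envelope falls by 60 dB in amplitude after `rt60` seconds.
    This is NOT an IR-GAN output, only a placeholder for the example.
    """
    rng = np.random.default_rng(seed)
    n = int(rt60 * sr)
    t = np.arange(n) / sr
    decay = np.exp(-6.908 * t / rt60)  # ln(1000) ~ 6.908, i.e. -60 dB at t = rt60
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0  # direct-path component
    return rir


def reverberate(clean, rir):
    """Convolve clean speech with an RIR to simulate far-field speech.

    The output is truncated to the input length and rescaled so its
    peak matches the clean signal's peak (an assumption of this sketch).
    """
    clean = np.asarray(clean, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    wet = np.convolve(clean, rir, mode="full")[: len(clean)]
    peak = np.max(np.abs(wet))
    if peak > 0:
        wet = wet * (np.max(np.abs(clean)) / peak)
    return wet
```

In practice, `clean` would be a waveform loaded from LibriSpeech and `rir` a synthesized or measured impulse response at the same sample rate; passing a single-sample impulse such as `[1.0]` returns the clean signal unchanged.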