Reverberant Sound Localization with a Robot Head Based on Direct-Path Relative Transfer Function

12/07/2020
by   Xiaofei Li, et al.

This paper addresses the problem of sound-source localization (SSL) with a robot head, which remains a challenge in real-world environments. In particular, we are interested in locating speech sources, as they are of high interest for human-robot interaction. The microphone-pair response corresponding to the direct-path sound propagation is a function of the source direction. In practice, this response is contaminated by noise and reverberation. The direct-path relative transfer function (DP-RTF) is defined as the ratio between the direct-path acoustic transfer functions (ATFs) of the two microphones, and it is an important feature for SSL. We propose a method to estimate the DP-RTF from noisy and reverberant signals in the short-time Fourier transform (STFT) domain. First, the convolutive transfer function (CTF) approximation is adopted to accurately represent the impulse response of the microphone array; the first coefficient of the CTF is mainly composed of the direct-path ATF. At each frequency, the frame-wise speech auto- and cross-power spectral densities (PSDs) are obtained by spectral subtraction. Then a set of linear equations is constructed from the speech auto- and cross-PSDs of multiple frames, in which the DP-RTF is an unknown variable that is estimated by solving the equations. Finally, the estimated DP-RTFs are concatenated across frequencies and used as a feature vector for SSL. Experiments with a robot, placed in various reverberant environments, show that the proposed method outperforms two state-of-the-art methods.
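As a rough illustration of the estimation step, the sketch below builds one possible set of linear equations at a single frequency and solves it by least squares. It is not taken from the paper: it assumes a noise-free two-microphone CTF model, x(p) = sum_q a_q s(p-q) and y(p) = sum_q b_q s(p-q), and uses the cross-relation between the two channels, which gives a_0 y(p) = sum_q b_q x(p-q) - sum_{q>=1} a_q y(p-q). Multiplying by conj(y(p)) and averaging over frames turns this into equations in auto- and cross-PSDs whose first unknown is b_0/a_0, the DP-RTF. The spectral-subtraction step is omitted, and the function name `estimate_dp_rtf`, the CTF length `Q`, the averaging length `D`, and the simulated filters are all hypothetical choices made for the example.

```python
import numpy as np


def estimate_dp_rtf(x, y, Q, D):
    """Estimate b_0/a_0 at one frequency from STFT coefficients x, y.

    Assumes a noise-free CTF model of length Q for both microphones.
    PSDs are estimated by a moving average over D frames.
    """
    P = len(x)
    frames = np.arange(Q - 1, P)
    # Per-frame regressor z(p) = [x(p), ..., x(p-Q+1), y(p-1), ..., y(p-Q+1)]
    Z = np.column_stack(
        [x[frames - q] for q in range(Q)]
        + [y[frames - q] for q in range(1, Q)]
    )
    t = y[frames]
    # Instantaneous products with conj(y(p)); averaging them gives PSD estimates.
    zy = Z * np.conj(t)[:, None]
    yy = t * np.conj(t)
    kernel = np.ones(D) / D
    phi_zy = np.column_stack(
        [np.convolve(zy[:, i], kernel, mode="valid") for i in range(zy.shape[1])]
    )
    phi_yy = np.convolve(yy, kernel, mode="valid")
    # One equation per frame: phi_yy(p) = phi_zy(p)^T g, with g[0] = b_0 / a_0.
    g, *_ = np.linalg.lstsq(phi_zy, phi_yy, rcond=None)
    return g[0]


# Synthetic single-frequency example (all values are made up for illustration).
rng = np.random.default_rng(0)
P, Q, D = 400, 3, 8
envelope = 0.2 + rng.random(P)  # crude non-stationary, speech-like envelope
s = envelope * (rng.standard_normal(P) + 1j * rng.standard_normal(P))
a = np.array([1.0, 0.4 - 0.2j, 0.1 + 0.1j])                   # CTF, microphone 1
b = np.array([0.8 * np.exp(0.6j), -0.3 + 0.1j, 0.05j])        # CTF, microphone 2
x = np.convolve(s, a)[:P]
y = np.convolve(s, b)[:P]

true_dp_rtf = b[0] / a[0]
est = estimate_dp_rtf(x, y, Q, D)
print("true:", true_dp_rtf, "estimated:", est)
```

In this noise-free setting the equations hold exactly, so the estimate should match b_0/a_0 up to numerical precision. With real recordings, the PSDs would first have to be cleaned of noise (the abstract uses spectral subtraction for this), and the per-frequency estimates would then be concatenated into the SSL feature vector.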

research
02/16/2022

Learning Deep Direct-Path Relative Transfer Function for Binaural Sound Source Localization

Direct-path relative transfer function (DP-RTF) refers to the ratio betw...
research
09/28/2018

Online Localization and Tracking of Multiple Moving Speakers in Reverberant Environments

This paper addresses the problem of online multiple-speaker localization...
research
12/12/2018

Description of algorithms for Ben-Gurion University Submission to the LOCATA challenge

This paper summarizes the methods used to localize the sources recorded ...
research
02/16/2022

SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization

Multiple moving sound source localization in real-world scenarios remain...
research
09/09/2021

Directional MCLP Analysis and Reconstruction for Spatial Speech Communication

Spatial speech communication, i.e., the reconstruction of spoken signal ...
research
11/21/2017

Reflection-Aware Sound Source Localization

We present a novel, reflection-aware method for 3D sound localization in...
research
11/21/2017

Multichannel Speech Separation and Enhancement Using the Convolutive Transfer Function

This paper addresses the problem of speech separation and enhancement fr...