TY - GEN
T1 - Physically Informed Spatial Regularization for Sound Event Localization and Detection
AU - Liu, Haocheng
AU - Di Carlo, Diego
AU - Arie Nugraha, Aditya
AU - Yoshii, Kazuyoshi
AU - Richard, Gaël
AU - Fontaine, Mathieu
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025/1/1
Y1 - 2025/1/1
N2 - Building Sound Event Localization and Detection (SELD) models that are robust to diverse acoustic environments remains one of the major challenges in multichannel signal processing, as reflections and reverberation can significantly confuse both the source direction and event detection. Introducing priors such as microphone geometry or room impulse responses (RIRs) into the model has proven effective in addressing this issue. Existing methods typically incorporate such priors in a deterministic way, often through data augmentation to increase data diversity. However, the uncertainty arising from the complex nature of audio acoustics remains largely underexplored in the SELD literature and naturally calls for stochastic modeling of acoustic priors. In this paper, we propose regularizing deep-learning-based SELD models with a physically constructed spatial covariance matrix (SCM) based on the estimated direction of arrival (DOA) and sound event detection (SED).
AB - Building Sound Event Localization and Detection (SELD) models that are robust to diverse acoustic environments remains one of the major challenges in multichannel signal processing, as reflections and reverberation can significantly confuse both the source direction and event detection. Introducing priors such as microphone geometry or room impulse responses (RIRs) into the model has proven effective in addressing this issue. Existing methods typically incorporate such priors in a deterministic way, often through data augmentation to increase data diversity. However, the uncertainty arising from the complex nature of audio acoustics remains largely underexplored in the SELD literature and naturally calls for stochastic modeling of acoustic priors. In this paper, we propose regularizing deep-learning-based SELD models with a physically constructed spatial covariance matrix (SCM) based on the estimated direction of arrival (DOA) and sound event detection (SED).
UR - https://www.scopus.com/pages/publications/105026953923
U2 - 10.1109/WASPAA66052.2025.11230919
DO - 10.1109/WASPAA66052.2025.11230919
M3 - Conference contribution
AN - SCOPUS:105026953923
T3 - IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
BT - Proceedings of the 2025 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2025
Y2 - 12 October 2025 through 15 October 2025
ER -