Multichannel audio source separation: Variational inference of time-frequency sources from time-domain observations

Simon Leglaive, Roland Badeau, Gael Richard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Many methods for multichannel audio source separation are based on probabilistic approaches in which the sources are modeled as latent random variables in a Time-Frequency (TF) domain. For reverberant mixtures, it is common to approximate the time-domain convolutive mixing process as being instantaneous in the short-time Fourier transform domain, under the assumption that the mixing filters are short. The TF latent sources are then inferred from the TF mixture observations. In this paper, we propose to infer the TF latent sources from the time-domain observations. This approach allows us to model the convolutive mixing process exactly. The inference procedure relies on a variational expectation-maximization algorithm. In significant reverberation conditions, our approach leads to a signal-to-distortion ratio improvement of 5.5 dB compared with the usual TF approximation of the convolutive mixing process.
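The short-filter assumption mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' method or experimental setup. It is a minimal, standard-library illustration under simplifying assumptions: a single white-noise source, a single channel, and a rectangular, non-overlapping analysis window. The function names, frame length, and the two exponentially decaying filters are all hypothetical choices made for the illustration. It compares exact time-domain convolution with the per-frame product H(f) S(f, n) and reports the relative error for a short filter and for a longer one.

```python
import cmath
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def linear_convolve(s, h):
    """Exact time-domain convolutive mixing, truncated to len(s)."""
    return [sum(h[k] * s[t - k] for k in range(len(h)) if t - k >= 0)
            for t in range(len(s))]


def narrowband_approx(s, h, frame_len):
    """Per-frame product H(f) * S(f, n), mapped back to the time domain.

    Assumes len(h) <= frame_len and len(s) is a multiple of frame_len.
    """
    H = dft(list(h) + [0.0] * (frame_len - len(h)))
    out = []
    for start in range(0, len(s), frame_len):
        S = dft(s[start:start + frame_len])
        frame = dft([Hf * Sf for Hf, Sf in zip(H, S)], inverse=True)
        out.extend(v.real for v in frame)
    return out


def relative_error(s, h, frame_len):
    """Energy of (exact - approximation) divided by energy of exact."""
    exact = linear_convolve(s, h)
    approx = narrowband_approx(s, h, frame_len)
    num = sum((a - b) ** 2 for a, b in zip(exact, approx))
    den = sum(a ** 2 for a in exact)
    return num / den


rng = random.Random(0)
frame_len = 64                                   # hypothetical frame length
source = [rng.gauss(0.0, 1.0) for _ in range(frame_len * 8)]
short_filter = [0.5 ** k for k in range(4)]      # decays within a few samples
long_filter = [0.95 ** k for k in range(48)]     # spans most of a frame

err_short = relative_error(source, short_filter, frame_len)
err_long = relative_error(source, long_filter, frame_len)
print(f"short filter: {err_short:.3f}, long filter: {err_long:.3f}")
```

In this toy setting the per-frame product corresponds to a circular convolution within each frame, so the samples that a real filter would carry over from the previous frame are replaced by samples wrapped around from the current one. The longer the filter is relative to the frame, the more energy is misplaced, which is the regime where modeling the mixing in the time domain is expected to matter most.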

Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 26-30
Number of pages: 5
ISBN (Electronic): 9781509041176
Publication status: Published - 16 Jun 2017
Externally published: Yes
Event: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017 - New Orleans, United States
Duration: 5 Mar 2017 – 9 Mar 2017

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2017
Country/Territory: United States
City: New Orleans
Period: 5/03/17 – 9/03/17

Keywords

  • Multichannel audio source separation
  • nonnegative matrix factorization
  • time-domain convolutive model
  • time-frequency source model
  • variational EM algorithm
