Group nonnegative matrix factorisation with speaker and session variability compensation for speaker identification

Romain Serizel, Slim Essid, Gael Richard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper presents a feature learning approach for speaker identification that is based on nonnegative matrix factorisation. Recent studies have shown that with such models, the dictionary atoms can represent speaker identity well. The approaches proposed so far have focused only on speaker variability and not on session variability. However, the latter is a crucial aspect of the success of the I-vector approach, which is now the state of the art in speaker identification. This paper proposes a method that relies on group nonnegative matrix factorisation and is inspired by the I-vector training procedure. In this way, the proposed approach aims to capture both speaker variability and session variability. Results on a small corpus show that the proposed approach can be competitive with I-vectors.

Original language: English
Title of host publication: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5470-5474
Number of pages: 5
ISBN (Electronic): 9781479999880
DOIs
Publication status: Published - 18 May 2016
Externally published: Yes
Event: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016 - Shanghai, China
Duration: 20 Mar 2016 to 25 Mar 2016

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2016-May
ISSN (Print): 1520-6149

Conference

Conference: 41st IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016
Country/Territory: China
City: Shanghai
Period: 20/03/16 to 25/03/16

Keywords

  • Nonnegative matrix factorisation
  • feature learning
  • speaker identification
  • speaker variability
  • spectrogram factorisation
