
O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker Diarization

  • Elio Gruttadauria
  • Mathieu Fontaine
  • Jonathan Le Roux
  • Slim Essid
  • Institut Polytechnique de Paris
  • Mitsubishi Electric Research Laboratories

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We introduce O-EENC-SD: an end-to-end online speaker diarization system based on EEND-EDA, featuring a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system provides key advantages over existing methods: a hyperparameter-free solution compared to unsupervised clustering approaches, and a more efficient alternative to current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD provides a great trade-off between DER and complexity, even when working on independent chunks with no overlap, making the system extremely efficient.
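
The abstract reports results in terms of DER (diarization error rate), the sum of missed speech, false-alarm speech, and speaker confusion divided by the total reference speech. As an illustration only, the sketch below computes a simplified frame-level DER in pure Python. The function name `frame_der` and the toy labels are hypothetical, and the sketch assumes equal-length inputs with the same number of speaker columns and no forgiveness collar; it is not the scoring protocol used by the authors, which this record does not describe.

```python
from itertools import permutations


def frame_der(ref, hyp):
    """Simplified frame-level DER (illustrative assumption, not the paper's scorer).

    ref, hyp: lists of per-frame rows, each row a list of 0/1 speaker activities.
    Returns the lowest error rate over all mappings of hypothesis speakers
    to reference speakers.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    n_spk = len(ref[0])
    total_ref = sum(sum(row) for row in ref)  # reference speaker-frames
    if total_ref == 0:
        raise ValueError("reference contains no speech")

    best_errors = None
    # Speaker labels are arbitrary, so try every column permutation of hyp.
    for perm in permutations(range(n_spk)):
        miss = false_alarm = confusion = 0
        for r, h in zip(ref, hyp):
            h_perm = [h[j] for j in perm]
            n_ref, n_hyp = sum(r), sum(h_perm)
            n_correct = sum(1 for a, b in zip(r, h_perm) if a and b)
            miss += max(n_ref - n_hyp, 0)
            false_alarm += max(n_hyp - n_ref, 0)
            confusion += min(n_ref, n_hyp) - n_correct
        errors = miss + false_alarm + confusion
        if best_errors is None or errors < best_errors:
            best_errors = errors
    return best_errors / total_ref


# Toy two-speaker example: the hypothesis uses swapped speaker labels,
# misses one frame of speech and adds one frame of false alarm.
ref = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0]]
hyp = [[0, 1], [0, 1], [1, 0], [0, 0], [1, 0]]
print(frame_der(ref, hyp))  # 2 errors over 4 reference speaker-frames: 0.5
```

Trying every permutation is affordable in the two-speaker setting the abstract targets; with many speakers, an assignment solver would be the usual replacement.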

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
Publication status: Published - 1 Jan 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: 6 Apr 2025 to 11 Apr 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 to 11/04/25

Keywords

  • CallHome
  • EEND-EDA
  • Online speaker diarization
  • conversational telephone speech (CTS)
