O-EENC-SD: Efficient Online End-to-End Neural Clustering for Speaker Diarization

  • Elio Gruttadauria
  • Mathieu Fontaine
  • Jonathan Le Roux
  • Slim Essid
  • Institut Polytechnique de Paris
  • Mitsubishi Electric Research Laboratories

Research output: Chapter in book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

We introduce O-EENC-SD: an end-to-end online speaker diarization system based on EEND-EDA, featuring a novel RNN-based stitching mechanism for online prediction. In particular, we develop a novel centroid refinement decoder whose usefulness is assessed through a rigorous ablation study. Our system provides key advantages over existing methods: a hyperparameter-free solution compared to unsupervised clustering approaches, and a more efficient alternative to current online end-to-end methods, which are computationally costly. We demonstrate that O-EENC-SD is competitive with the state of the art in the two-speaker conversational telephone speech domain, as tested on the CallHome dataset. Our results show that O-EENC-SD provides a great trade-off between DER and complexity, even when working on independent chunks with no overlap, making the system extremely efficient.

Original language: English
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
DOIs
Publication status: Published - 1 Jan 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: 6 Apr 2025 to 11 Apr 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 6/04/25 to 11/04/25
