Speaker diarization using data-driven audio sequencing

Houssemeddine Khemiri, Dijana Petrovska-Delacretaz, Gerard Chollet

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, a speaker diarization system based on data-driven segmentation is proposed. In addition to the usual segmentation and clustering steps, a new module is added that detects segments repeated across episodes of the same show broadcast on different dates. This detection relies on the ALISP-based audio identification system, which segments audio data into pseudo-phonetic units. The ALISP segmentation is then used to identify similar audio segments in TV and radio shows. The system was evaluated in the ETAPE 2011 evaluation campaign and obtained a Diarization Error Rate (DER) of 16.23%, the best result among seven participants.
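The repeated-segment idea can be illustrated with a minimal sketch, assuming each episode has already been transcribed into a sequence of unit labels. The function name, the `min_len` threshold, and the labels below are hypothetical, and the exact matching done by Python's `difflib` merely stands in for whatever matching the ALISP-based system performs; it is not the authors' method.

```python
from difflib import SequenceMatcher


def find_repeated_segments(units_a, units_b, min_len=4):
    """Return (start_a, start_b, length) for each run of identical
    unit labels shared by the two sequences, keeping runs of at
    least min_len units. Exact matching only (an assumption)."""
    matcher = SequenceMatcher(None, units_a, units_b, autojunk=False)
    return [
        (m.a, m.b, m.size)
        for m in matcher.get_matching_blocks()
        if m.size >= min_len  # also drops the zero-length sentinel block
    ]


# Hypothetical unit labels for two episodes of the same show.
episode_1 = ["HA", "HB", "HC", "HD", "HE", "H1", "H2", "H3"]
episode_2 = ["H9", "HA", "HB", "HC", "HD", "HE", "H7"]

print(find_repeated_segments(episode_1, episode_2))
```

Segments found this way (for example a jingle or a recurring announcement) could then be given the same label in every episode; how the proposed system uses them within diarization is not detailed in the abstract.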

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 7736-7740
Number of pages: 5
Publication status: Published - 18 Oct 2013
Externally published: Yes
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: 26 May 2013 → 31 May 2013

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country/Territory: Canada
City: Vancouver, BC
Period: 26/05/13 → 31/05/13

Keywords

  • ALISP units
  • data-driven audio sequencing
  • speaker diarization
