Text-prompted speaker verification experiments with phoneme specific MLPs

Dijana Petrovska Delacretaz, Jean Hennebert

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

The aims of the study described in this paper are (1) to assess the relative speaker discriminant properties of phonemes and (2) to investigate the importance of the temporal frame-to-frame information for speaker modelling in the framework of a text-prompted speaker verification system using hidden Markov models (HMMs) and multilayer perceptrons (MLPs). It is shown that, with similar experimental conditions, nasals, fricatives and vowels convey more speaker specific information than plosives and liquids. Regarding the influence of the frame-to-frame temporal information, significant improvements are reported from the inclusion of several acoustic frames at the input of the MLPs. The results tend also to show that each phoneme has its optimal MLP context size giving the best equal error rate (EER).

Original languageEnglish
Title of host publicationProceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Pages777-780
Number of pages4
DOIs
Publication statusPublished - 1 Dec 1998
Externally publishedYes
Event1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998 - Seattle, WA, United States
Duration: 12 May 199815 May 1998

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2
ISSN (Print)1520-6149

Conference

Conference1998 23rd IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 1998
Country/TerritoryUnited States
CitySeattle, WA
Period12/05/9815/05/98

Fingerprint

Dive into the research topics of 'Text-prompted speaker verification experiments with phoneme specific MLPs'. Together they form a unique fingerprint.

Cite this