Abstract
Speech is composed of different sounds (acoustic segments), and speakers differ in how they pronounce these sounds. The segmental approaches described in this paper aim to exploit these differences for speaker verification. In these approaches, the speech is divided into acoustic classes, and speaker modeling is performed separately for each class. The segmentation is produced by automatic, language-independent speech processing tools and requires neither phonetic nor orthographic transcriptions of the speech data. Two speaker modeling approaches, based on multilayer perceptrons (MLPs) and on Gaussian mixture models (GMMs), are studied. The MLP-based segmental systems perform comparably to the global MLP-based systems, and under mismatched train-test conditions the segmental MLP system gives slightly better results. The segmental GMM systems give poorer results than the equivalent global GMM systems.
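The segmental scheme the abstract describes can be sketched in a few lines: one speaker GMM per acoustic class, with a per-class background GMM for score normalization, and a verification score averaged over classes. This is a hypothetical illustration, not the paper's actual system; the feature generator, class count, and mixture sizes are invented for the example, and real systems would use cepstral features from segmented speech.

```python
# Hedged sketch of class-based (segmental) GMM speaker verification.
# All data here is synthetic; make_features() stands in for cepstral
# features extracted from one acoustic class of a speaker's speech.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def make_features(mean, n=300, dim=12):
    # Synthetic stand-in for per-class feature vectors.
    return rng.normal(mean, 1.0, size=(n, dim))

N_CLASSES = 2  # e.g. broad vowel-like vs consonant-like segments

# Train one target-speaker GMM and one background GMM per class.
target, background = [], []
for c in range(N_CLASSES):
    target.append(GaussianMixture(4, random_state=0).fit(make_features(c + 0.5)))
    background.append(GaussianMixture(4, random_state=0).fit(make_features(c)))

def score(segments):
    # segments: list of (class_index, feature_matrix) pairs.
    # Verification score = mean per-class log-likelihood ratio.
    llr = [target[c].score(x) - background[c].score(x) for c, x in segments]
    return float(np.mean(llr))

# Genuine trial: features drawn from the target speaker's distribution.
genuine = [(c, make_features(c + 0.5, n=100)) for c in range(N_CLASSES)]
# Impostor trial: features drawn from a different distribution.
impostor = [(c, make_features(c - 0.5, n=100)) for c in range(N_CLASSES)]
s_gen, s_imp = score(genuine), score(impostor)
```

Accepting a claimed identity when the score exceeds a threshold completes the verifier; the genuine trial scores higher than the impostor trial because its features match the per-class target models better than the background models.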
| Original language | English |
|---|---|
| Pages (from-to) | 198-212 |
| Number of pages | 15 |
| Journal | Digital Signal Processing: A Review Journal |
| Volume | 10 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - 1 Jan 2000 |
| Externally published | Yes |