TY - GEN
T1 - Using data-driven and phonetic units for speaker verification
AU - Hannani, Asmaa El
AU - Toledano, Doroteo T.
AU - Petrovska-Delacrétaz, Dijana
AU - Montero-Asenjo, Alberto
AU - Hennebert, Jean
PY - 2006/12/1
Y1 - 2006/12/1
N2 - Recognition of speaker identity based on modeling the streams produced by phonetic decoders (phonetic speaker recognition) has gained popularity during the past few years. Two of the major problems that arise when phone-based systems are being developed are the possible mismatches between the development and evaluation data and the lack of transcribed databases. Data-driven segmentation techniques provide a potential solution to these problems because they do not use transcribed data and can easily be applied to development data, minimizing the mismatches. In this paper we compare speaker recognition results using phonetic and data-driven decoders. To this end, we compare a speaker recognition system based on data-driven acoustic units with phonetic speaker recognition systems trained on Spanish and English data. Results obtained on the NIST 2005 Speaker Recognition Evaluation data show that the data-driven approach outperforms the phonetic one and that further improvements can be achieved by combining both approaches.
DO - 10.1109/ODYSSEY.2006.248134
M3 - Conference contribution
AN - SCOPUS:42749106778
SN - 142440472X
SN - 9781424404728
T3 - IEEE Odyssey 2006: Workshop on Speaker and Language Recognition
BT - IEEE Odyssey 2006
T2 - IEEE Odyssey 2006: Workshop on Speaker and Language Recognition
Y2 - 28 June 2006 through 30 June 2006
ER -