TY - GEN
T1 - Comparing data-driven and phonetic N-gram systems for text-independent speaker verification
AU - El Hannani, Asmaa
AU - Petrovska-Delacrétaz, Dijana
PY - 2007/12/1
Y1 - 2007/12/1
N2 - Recognition of speaker identity based on modeling the streams produced by phonetic decoders (phonetic speaker recognition) has gained popularity during the past few years. Two major problems that arise when phone-based systems are developed are possible mismatches between the development and evaluation data and the lack of transcribed databases. Data-driven segmentation techniques offer a potential solution to these problems because they do not require transcribed data and can easily be applied to development data, minimizing the mismatches. In this paper we compare speaker recognition results obtained with phonetic and data-driven decoders, using two sets of speaker verification systems: the first based on data-driven units and the second on phonetic units. Results on the NIST 2006 Speaker Recognition Evaluation data show that the data-driven approach is comparable to the phonetic one and that further improvements can be achieved by combining both approaches.
AB - Recognition of speaker identity based on modeling the streams produced by phonetic decoders (phonetic speaker recognition) has gained popularity during the past few years. Two major problems that arise when phone-based systems are developed are possible mismatches between the development and evaluation data and the lack of transcribed databases. Data-driven segmentation techniques offer a potential solution to these problems because they do not require transcribed data and can easily be applied to development data, minimizing the mismatches. In this paper we compare speaker recognition results obtained with phonetic and data-driven decoders, using two sets of speaker verification systems: the first based on data-driven units and the second on phonetic units. Results on the NIST 2006 Speaker Recognition Evaluation data show that the data-driven approach is comparable to the phonetic one and that further improvements can be achieved by combining both approaches.
U2 - 10.1109/BTAS.2007.4401945
DO - 10.1109/BTAS.2007.4401945
M3 - Conference contribution
AN - SCOPUS:48649108511
SN - 9781424415977
T3 - IEEE Conference on Biometrics: Theory, Applications and Systems, BTAS'07
BT - IEEE Conference on Biometrics: Theory, Applications and Systems, BTAS'07
T2 - 1st IEEE International Conference on Biometrics: Theory, Applications, and Systems, BTAS '07
Y2 - 27 September 2007 through 29 September 2007
ER -