TY - GEN
T1 - Exploiting high-level information provided by ALISP in speaker recognition
AU - El Hannani, Asmaa
AU - Petrovska-Delacrétaz, Dijana
PY - 2005/12/1
Y1 - 2005/12/1
N2 - The best-performing systems in the area of automatic speaker recognition have focused on using short-term, low-level acoustic information, such as cepstral features. Recently, various works have demonstrated that high-level features convey more speaker information and can be added to the low-level features in order to increase the robustness of the system. This paper describes a text-independent speaker recognition system exploiting high-level information provided by ALISP (Automatic Language Independent Speech Processing), a data-driven segmentation. This system, denoted here as the ALISP n-gram system, captures the speaker-specific information only by analyzing sequences of ALISP units. The ALISP n-gram system was fused with an acoustic ALISP-based Gaussian Mixture Models (GMM) system exploiting the speaker-discriminating properties of individual speech classes. The resulting fused system reduced the error rate over the individual systems on the NIST 2004 Speaker Recognition Evaluation data.
M3 - Conference contribution
AN - SCOPUS:33745472612
SN - 3540312579
SN - 9783540312574
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 66
EP - 71
BT - Nonlinear Analyses and Algorithms for Speech Processing - International Conference on Non-Linear Speech Processing, NOLISP 2005, Revised Selected Papers
T2 - International Conference on Non-Linear Speech Processing, NOLISP 2005
Y2 - 19 April 2005 through 22 April 2005
ER -