TY - GEN
T1 - A mutual information kernel for sequences
AU - Cuturi, Marco
AU - Vert, Jean-Philippe
PY - 2004/12/1
Y1 - 2004/12/1
AB - We propose a new kernel for strings which borrows ideas and techniques from information theory and data compression. This kernel can be used in combination with any kernel method, in particular Support Vector Machines for protein classification. By incorporating prior assumptions on the properties of the alphabet and using a Bayesian averaging framework, we compute the value of this kernel in linear time and space, benefiting from previous achievements proposed in the field of universal coding. Encouraging classification results are reported on a standard protein homology detection experiment.
DO - 10.1109/IJCNN.2004.1380902
M3 - Conference contribution
AN - SCOPUS:10844292588
SN - 0780383591
T3 - IEEE International Conference on Neural Networks - Conference Proceedings
SP - 1905
EP - 1910
BT - 2004 IEEE International Joint Conference on Neural Networks - Proceedings
T2 - 2004 IEEE International Joint Conference on Neural Networks - Proceedings
Y2 - 25 July 2004 through 29 July 2004
ER -