TY - GEN
T1 - Anomaly ranking as supervised bipartite ranking
AU - Clémençon, Stéphan
AU - Robbiano, Sylvain
N1 - Publisher Copyright:
Copyright © (2014) by the International Machine Learning Society (IMLS) All rights reserved.
PY - 2014/1/1
Y1 - 2014/1/1
N2 - The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with an MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data-generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with the same support. In this situation, the MV curve can then be seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well documented in the literature are displayed in order to illustrate the relevance of our approach.
AB - The Mass Volume (MV) curve is a visual tool to evaluate the performance of a scoring function with regard to its capacity to rank data in the same order as the underlying density function. Anomaly ranking refers to the unsupervised learning task which consists in building a scoring function, based on unlabeled data, with an MV curve as low as possible at any point. In this paper, it is proved that, in the case where the data-generating probability distribution has compact support, anomaly ranking is equivalent to (supervised) bipartite ranking, where the goal is to discriminate between the underlying probability distribution and the uniform distribution with the same support. In this situation, the MV curve can then be seen as a simple transform of the corresponding ROC curve. Exploiting this view, we then show how to use bipartite ranking algorithms, possibly combined with random sampling, to solve the MV curve minimization problem. Numerical experiments based on a variety of bipartite ranking algorithms well documented in the literature are displayed in order to illustrate the relevance of our approach.
M3 - Conference contribution
AN - SCOPUS:84919907259
T3 - 31st International Conference on Machine Learning, ICML 2014
SP - 1637
EP - 1645
BT - 31st International Conference on Machine Learning, ICML 2014
PB - International Machine Learning Society (IMLS)
T2 - 31st International Conference on Machine Learning, ICML 2014
Y2 - 21 June 2014 through 26 June 2014
ER -