Learning to Rank Anomalies: Scalar Performance Criteria and Maximization of Two-Sample Rank Statistics

Research output: Contribution to journal › Conference article › peer-review

Abstract

The ability to collect and store ever more massive databases has been accompanied by the need to process them efficiently. In many cases, most observations behave similarly, while a presumably small proportion of them are abnormal. Detecting the latter, referred to as outliers, is one of the major challenges for machine learning applications (e.g. in fraud detection or predictive maintenance). In this paper, we propose a methodology for outlier detection that learns a data-driven scoring function, defined on the feature space, which reflects the degree of abnormality of the observations. This scoring function is learnt through a well-designed binary classification problem whose empirical criterion takes the form of a two-sample linear rank statistic, for which theoretical results are available. We illustrate our methodology with encouraging preliminary numerical experiments.
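
To make the notion of a two-sample linear rank statistic concrete, the following minimal sketch computes one from the scores that a scoring function assigns to two samples. It uses the common textbook form, summing a score-generating function over the normalized pooled ranks of the first sample. That form is assumed here and may differ from the exact criterion used in the paper. The function name `two_sample_linear_rank_statistic`, its parameters, and the tie-breaking rule are illustrative choices that do not come from the abstract.

```python
import numpy as np


def two_sample_linear_rank_statistic(scores_a, scores_b, score_generating=lambda u: u):
    """Sum of score_generating(rank / (N + 1)) over the first sample.

    Ranks are computed in the pooled sample of size N = len(a) + len(b).
    Assumption: ties are broken by position in the pooled array.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    pooled = np.concatenate([scores_a, scores_b])
    n_total = pooled.size

    # Ordinal ranks 1..N of every pooled score.
    order = np.argsort(pooled, kind="stable")
    ranks = np.empty(n_total, dtype=float)
    ranks[order] = np.arange(1, n_total + 1)

    # Keep only the ranks of the first sample and normalize to (0, 1).
    normalized = ranks[: scores_a.size] / (n_total + 1)
    return float(np.sum(score_generating(normalized)))


# Hypothetical scores for two small samples.
first = [3.0, 5.0]
second = [1.0, 2.0, 4.0]
print(two_sample_linear_rank_statistic(first, second))
print(two_sample_linear_rank_statistic(first, second, lambda u: u ** 2))
```

With the identity as score-generating function, the statistic reduces to a rescaled Wilcoxon rank-sum. Other choices, such as the squared function in the second call, put more weight on the highest ranks.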

Original language: English
Pages (from-to): 63-75
Number of pages: 13
Journal: Proceedings of Machine Learning Research
Volume: 154
Publication status: Published - 1 Jan 2021
Event: 3rd International Workshop on Learning with Imbalanced Domains: Theory and Applications, LIDTA 2021 - Virtual, Online, Spain
Duration: 17 Sept 2021 → …

Keywords

  • Anomaly ranking
  • Novelty detection
  • Two-sample linear rank statistics
