Tree-based Kendall's τ Maximization for Explainable Unsupervised Anomaly Detection

Lanfang Kong, Alexis Huet, Dario Rossi, Mauro Sozio

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

We study the problem of building a relatively small regression tree that maximizes the Kendall's tau coefficient between the anomaly scores of a source anomaly detection algorithm and those predicted by our regression tree. We consider a labeling function which assigns to each leaf the inverse of its size, thereby providing satisfactory explanations when comparing examples with different anomaly scores. We show that our approach can be used as a post-hoc model, i.e. to provide global explanations for an existing anomaly detection algorithm. Moreover, it can be used as an in-model approach, i.e. the source anomaly detection algorithm can be replaced altogether. This is made possible by the off-the-shelf transparency of tree-based approaches and by the fact that the explanations provided by our approach do not rely on the source anomaly detection algorithm. The main technical challenge is the efficient computation of the Kendall's tau coefficients when determining the best split at each node of the regression tree. We show how such a coefficient can be computed incrementally, thereby making the running time of our algorithm almost linear (up to a logarithmic factor) in the size of the input. Our approach is completely unsupervised, which is appealing when it is difficult to collect a large number of labeled examples. We complement our study with an extensive experimental evaluation against the state-of-the-art, showing the effectiveness of our approach.
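
To make the objective in the abstract concrete, the following is a minimal Python sketch of scoring one candidate split by Kendall's tau. It is not the authors' algorithm: it recomputes the coefficient from scratch in quadratic time for every threshold, whereas the paper describes an incremental computation. The function names `kendall_tau_b` and `best_split`, the toy data, and the choice of the tau-b variant (used here because leaf labels produce many ties) are all assumptions made for illustration.

```python
from itertools import combinations


def kendall_tau_b(x, y):
    """Naive O(n^2) Kendall's tau-b between two equal-length sequences.

    Assumption: tau-b is used to account for ties; the abstract does not
    say which variant of Kendall's tau the paper optimizes.
    """
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both sequences: contributes to neither count
        elif dx == 0:
            ties_x += 1  # tied only in x
        elif dy == 0:
            ties_y += 1  # tied only in y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x) * (untied + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


def best_split(feature, source_scores):
    """Hypothetical helper: try every threshold on a single feature.

    Each side of the split is labeled with the inverse of its size, as in
    the abstract, and the threshold with the highest tau is returned.
    """
    n = len(feature)
    best_threshold, best_tau = None, -2.0  # tau always lies in [-1, 1]
    for threshold in sorted(set(feature))[:-1]:
        left_size = sum(1 for v in feature if v <= threshold)
        right_size = n - left_size
        predicted = [
            1.0 / left_size if v <= threshold else 1.0 / right_size
            for v in feature
        ]
        tau = kendall_tau_b(source_scores, predicted)
        if tau > best_tau:
            best_threshold, best_tau = threshold, tau
    return best_threshold, best_tau


# Toy data (made up): one feature and the scores of some source detector.
feature = [1.0, 1.2, 0.9, 1.1, 5.0]
source_scores = [0.1, 0.2, 0.15, 0.12, 0.9]
print(best_split(feature, source_scores))
```

Because smaller leaves receive larger labels, a split that places the highest-scoring examples in the smaller partition yields a higher tau. A full tree would apply such a search recursively over all features, which is where the incremental computation mentioned in the abstract becomes important.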

Original language: English
Title of host publication: Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023
Editors: Guihai Chen, Latifur Khan, Xiaofeng Gao, Meikang Qiu, Witold Pedrycz, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1073-1078
Number of pages: 6
ISBN (Electronic): 9798350307887
DOIs
Publication status: Published - 1 Jan 2023
Externally published: Yes
Event: 23rd IEEE International Conference on Data Mining, ICDM 2023 - Shanghai, China
Duration: 1 Dec 2023 to 4 Dec 2023

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 23rd IEEE International Conference on Data Mining, ICDM 2023
Country/Territory: China
City: Shanghai
Period: 1/12/23 to 4/12/23
