Random histogram forest for unsupervised anomaly detection

Andrian Putina, Mauro Sozio, Dario Rossi, Jose M. Navarro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Roughly speaking, anomaly detection consists of identifying instances whose features deviate significantly from the rest of the input data. It is one of the most widely studied problems in unsupervised machine learning, with applications in network intrusion detection, healthcare and many other areas. Several methods have been developed in recent years; however, to the best of our knowledge, a satisfactory solution is still missing. We present Random Histogram Forest, an effective approach for unsupervised anomaly detection. Our approach is probabilistic, a property that has proved effective in identifying anomalies. Moreover, it employs the fourth central moment (aka kurtosis) to identify potentially anomalous instances. We conduct an extensive experimental evaluation on 38 datasets, including all benchmarks for anomaly detection, and compare against the most successful algorithms for unsupervised anomaly detection, to the best of our knowledge. We evaluate all approaches in terms of average precision (AP), which summarizes the area under the precision-recall curve. Our evaluation shows that our approach significantly outperforms all other approaches in terms of AP while running in linear time.
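To make the role of kurtosis concrete, the following is a minimal sketch and not the authors' implementation. It assumes kurtosis is computed in the usual way, as the fourth central moment divided by the squared variance, and shows that a feature containing a single extreme value scores far higher than a clean Gaussian feature. The function names `kurtosis` and `feature_kurtosis` and the synthetic data are illustrative assumptions, not part of the paper.

```python
import numpy as np


def kurtosis(column):
    """Pearson kurtosis: fourth central moment divided by squared variance."""
    x = np.asarray(column, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    if variance == 0.0:
        # Constant feature: no spread, treated here as uninformative (assumption).
        return 0.0
    return float(np.mean(centered ** 4) / variance ** 2)


def feature_kurtosis(data):
    """Kurtosis of every column of a 2-D array (rows are instances)."""
    data = np.asarray(data, dtype=float)
    return np.array([kurtosis(data[:, j]) for j in range(data.shape[1])])


# Synthetic data (hypothetical): two Gaussian features, one with an injected outlier.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))
data[0, 1] = 25.0  # a single extreme value in feature 1

scores = feature_kurtosis(data)
print(scores)  # feature 1 has a much larger kurtosis than feature 0
```

A heavy tail inflates the fourth moment much faster than the variance, which is why a high-kurtosis feature is a plausible place to look for anomalous instances. How Random Histogram Forest actually uses this quantity is described in the paper itself, not in this sketch.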

Original language: English
Title of host publication: Proceedings - 20th IEEE International Conference on Data Mining, ICDM 2020
Editors: Claudia Plant, Haixun Wang, Alfredo Cuzzocrea, Carlo Zaniolo, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1226-1231
Number of pages: 6
ISBN (Electronic): 9781728183169
Publication status: Published - 1 Nov 2020
Externally published: Yes
Event: 20th IEEE International Conference on Data Mining, ICDM 2020 - Virtual, Sorrento, Italy
Duration: 17 Nov 2020 to 20 Nov 2020

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2020-November
ISSN (Print): 1550-4786

Conference

Conference: 20th IEEE International Conference on Data Mining, ICDM 2020
Country/Territory: Italy
City: Virtual, Sorrento
Period: 17/11/20 to 20/11/20

