Abstract
In a wide variety of situations, anomalies in the behaviour of a complex system, whose health is monitored through the observation of a random vector X = (X_1, …, X_d) valued in R^d, correspond to the simultaneous occurrence of extreme values for certain subgroups α ⊂ {1, …, d} of the variables X_j. Under the heavy-tail assumption, which is well suited to modelling these phenomena, statistical methods relying on multivariate extreme value theory have been developed in the past few years for identifying such events and subgroups. This paper takes this approach considerably further by means of a novel mixture model that describes the distribution of extremal observations and in which the anomaly type α is viewed as a latent variable. One may then take advantage of the model by assigning to any extreme point a posterior probability for each anomaly type α, which implicitly defines a similarity measure between anomalies. It is explained at length how this similarity measure makes it possible to cluster extreme observations and to obtain an informative planar representation of anomalies using standard graph-mining tools. The relevance and usefulness of the resulting clustering and 2-d visual display are illustrated on simulated datasets and on real observations from the aeronautics application domain.
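The step from posterior probabilities to clusters can be illustrated with a minimal sketch. It is not the paper's method: the similarity used here (the inner product of two posterior vectors), the threshold value, the toy posterior matrix, and the function names `posterior_similarity` and `threshold_components` are all assumptions made for illustration. The graph-mining step is reduced to connected components of a thresholded similarity graph.

```python
import numpy as np


def posterior_similarity(post):
    """Pairwise similarity between extreme points.

    `post` is an (n, K) array whose row i holds the posterior probabilities
    of the K anomaly types for point i. The inner product used here is an
    assumed similarity, chosen only for illustration.
    """
    post = np.asarray(post, dtype=float)
    return post @ post.T


def threshold_components(sim, tau):
    """Label the connected components of the graph that links points i and j
    whenever sim[i, j] >= tau (depth-first search, labels start at 0)."""
    n = sim.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and sim[i, j] >= tau:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical posteriors for 5 extreme points and 3 anomaly types.
toy_post = np.array([
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.00],
    [0.05, 0.05, 0.90],
    [0.00, 0.10, 0.90],
    [0.10, 0.80, 0.10],
])

sim = posterior_similarity(toy_post)
labels = threshold_components(sim, tau=0.5)  # tau is an arbitrary choice
print(labels)
```

Points whose posterior mass falls on the same anomaly type end up in the same component. A weighted graph built from `sim` could equally be passed to a standard graph layout or community-detection tool to obtain a planar display.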
| Original language | English |
|---|---|
| Pages (from-to) | 607-628 |
| Number of pages | 22 |
| Journal | Computational Statistics |
| Volume | 35 |
| Issue number | 2 |
| DOIs | |
| Status | Published - 1 June 2020 |
Fingerprint
Explore the research topics of "A multivariate extreme value theory approach to anomaly clustering and visualization". Together they form a unique fingerprint.