Passer à la navigation principale Passer à la recherche Passer au contenu principal

Consistent algorithms for clustering time series

  • Department of Mathematics and Statistics, Lancaster University
  • INRIA Institut National de Recherche en Informatique et en Automatique
  • Université de Lille

Résultats de recherche: Contribution à un journalArticleRevue par des pairs

Résumé

The problem of clustering is considered for the case where every point is a time series. The time series are either given in one batch (offline setting), or they are allowed to grow with time and new time series can be added along the way (online setting). We propose a natural notion of consistency for this problem, and show that there are simple, computationally eficient algorithms that are asymptotically consistent under extremely weak assumptions on the distributions that generate the data. The notion of consistency is as follows. A clustering algorithm is called consistent if it places two time series into the same cluster if and only if the distribution that generates them is the same. In the considered framework the time series are allowed to be highly dependent, and the dependence can have arbitrary form. If the number of clusters is known, the only assumption we make is that the (marginal) distribution of each time series is stationary ergodic. No parametric, memory or mixing assumptions are made. When the number of clusters is unknown, stronger assumptions are provably necessary, but it is still possible to devise nonparametric algorithms that are consistent under very general conditions. The theoretical findings of this work are illustrated with experiments on both synthetic and real data.

langue originaleAnglais
journalJournal of Machine Learning Research
Volume17
étatPublié - 1 janv. 2016
Modification externeOui

Empreinte digitale

Examiner les sujets de recherche de « Consistent algorithms for clustering time series ». Ensemble, ils forment une empreinte digitale unique.

Contient cette citation