Clustering based active learning for evolving data streams

  • Dino Ienco
  • Albert Bifet
  • Indre Žliobaite
  • Bernhard Pfahringer
  • Christian Puech, Emeritus Researcher, Irstea, UMR TETIS
  • DALI/LIRMM
  • Yahoo Research Barcelona
  • Aalto University
  • University of Waikato

Research output: Chapter in a book, report, anthology or collection; Conference contribution; Peer-reviewed

Abstract

Data labeling is an expensive and time-consuming task, so choosing which instances to label is becoming increasingly important. In the active learning setting, a classifier is trained by requesting labels for only a small fraction of all instances. While many works address this problem in non-streaming scenarios, few address it in the data stream setting. In this paper we propose a new active learning approach for evolving data streams that uses a pre-clustering step to select the most informative instances for labeling. We consider a batch-incremental setting: when a new batch arrives, we first cluster the examples and then select the best instances to train the learner. Clustering makes it possible to cover the whole data space and avoids oversampling examples from only a few areas. We compare our method with state-of-the-art active learning strategies on real datasets. The results highlight the improvement in performance achieved by our proposal. Experiments on parameter sensitivity are also reported.
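
The cluster-then-select step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the selection criterion, so everything specific here is an assumption made for illustration: k-means with farthest-first seeding, a caller-supplied `uncertainty` score, a round-robin pick across clusters, and the function names `kmeans` and `select_for_labeling`. It is not the authors' implementation.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point: Point, centroids: List[List[float]]) -> int:
    # Index of the closest centroid; ties go to the lowest index.
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points: List[Point], k: int, iterations: int = 10) -> List[int]:
    """Lloyd's k-means; returns one cluster index per point.

    Farthest-first seeding is an assumption chosen to keep the sketch
    deterministic; the abstract does not specify a clustering method.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iterations):
        assignment = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(d) / len(members) for d in zip(*members)]
    return [nearest(p, centroids) for p in points]


def select_for_labeling(batch: List[Point],
                        uncertainty: Callable[[Point], float],
                        k: int,
                        budget: int) -> List[int]:
    """Return indices of at most `budget` instances of `batch` to label.

    Hypothetical policy: rank each cluster by descending uncertainty,
    then take one instance per cluster in turn, so that every region
    of the batch is represented before any region is sampled twice.
    """
    assignment = kmeans(batch, k)
    clusters: List[List[int]] = [[] for _ in range(k)]
    for index, cluster_id in enumerate(assignment):
        clusters[cluster_id].append(index)
    for members in clusters:
        members.sort(key=lambda i: uncertainty(batch[i]), reverse=True)

    selected: List[int] = []
    rank = 0
    while len(selected) < budget and any(rank < len(m) for m in clusters):
        for members in clusters:
            if rank < len(members) and len(selected) < budget:
                selected.append(members[rank])
        rank += 1
    return selected
```

In a batch-incremental loop, one would call `select_for_labeling` on each arriving batch, request labels for the returned indices, and update the learner with those labeled instances only. The round-robin pick is what spreads the labeling budget across clusters instead of concentrating it in one area.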

Original language: English
Title: Discovery Science - 16th International Conference, DS 2013, Proceedings
Publisher: Springer Verlag
Pages: 79-93
Number of pages: 15
ISBN (print): 9783642408960
State: Published - 1 Jan 2013
Externally published: Yes
Event: 16th International Conference on Discovery Science, DS 2013 - Singapore, Singapore
Duration: 6 Oct 2013 to 9 Oct 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8140 LNAI
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 16th International Conference on Discovery Science, DS 2013
Country/Territory: Singapore
City: Singapore
Period: 6/10/13 to 9/10/13
