Streaming random patches for evolving data stream classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Ensemble methods are a popular choice for learning from evolving data streams. This popularity is due to (i) the ability to simulate simple yet successful ensemble learning strategies, such as bagging and random forests; (ii) the possibility of incorporating drift detection and recovery in conjunction with the ensemble algorithm; and (iii) the availability of efficient incremental base learners, such as Hoeffding trees. In this work, we introduce the Streaming Random Patches (SRP) algorithm, an ensemble method specially adapted to stream classification that combines random subspaces and online bagging. We provide theoretical insights and empirical results illustrating different aspects of SRP. In particular, we explain how the widely adopted incremental Hoeffding trees are not, in fact, unstable learners, unlike their batch counterparts, and how this fact significantly influences the design and performance of ensemble methods. We compare SRP against state-of-the-art ensemble variants for streaming data on a multitude of datasets. The results show that SRP achieves high predictive performance on both real and synthetic datasets. We also analyze the diversity over time and the average tree depth, which provide insights into the differences between local subspace randomization (as in random forests) and global subspace randomization (as in random subspaces).
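The combination of random subspaces and online bagging described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `learn_one`/`predict_one` interface, and the default `lam=6.0` are assumptions made for illustration, the base learner is a toy nearest-centroid model standing in for a Hoeffding tree, and drift detection and recovery are omitted entirely. Each ensemble member is given one fixed random feature subset (global subspace randomization), and each incoming instance is presented to that member with a weight drawn from a Poisson distribution (online bagging).

```python
import math
import random
from collections import defaultdict


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class NearestCentroid:
    """Toy incremental base learner (a stand-in, NOT a Hoeffding tree)."""

    def __init__(self):
        self.sums = {}                     # label -> weighted per-feature sums
        self.weights = defaultdict(float)  # label -> total weight seen

    def learn_one(self, x, y, w=1):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        for i, v in enumerate(x):
            self.sums[y][i] += w * v
        self.weights[y] += w

    def predict_one(self, x):
        best, best_dist = None, float("inf")
        for y, s in self.sums.items():
            n = self.weights[y]
            dist = sum((v - si / n) ** 2 for v, si in zip(x, s))
            if dist < best_dist:
                best, best_dist = y, dist
        return best


class RandomPatchesEnsemble:
    """Hypothetical sketch: random subspaces + online bagging, no drift handling."""

    def __init__(self, n_features, n_models=10, subspace_size=2, lam=6.0, seed=1):
        self.rng = random.Random(seed)
        self.lam = lam  # assumed Poisson rate for online bagging
        # Global subspace randomization: one fixed feature subset per member.
        self.subspaces = [
            sorted(self.rng.sample(range(n_features), subspace_size))
            for _ in range(n_models)
        ]
        self.models = [NearestCentroid() for _ in range(n_models)]

    def learn_one(self, x, y):
        for feats, model in zip(self.subspaces, self.models):
            k = poisson(self.lam, self.rng)  # online bagging weight
            if k > 0:
                model.learn_one([x[i] for i in feats], y, w=k)

    def predict_one(self, x):
        votes = defaultdict(int)
        for feats, model in zip(self.subspaces, self.models):
            pred = model.predict_one([x[i] for i in feats])
            if pred is not None:
                votes[pred] += 1
        # Majority vote; None until at least one member has been trained.
        return max(votes, key=votes.get) if votes else None
```

Because each member sees both a resampled stream (rows) and a fixed feature subset (columns), it is effectively trained on a random "patch" of the data, which is where the name comes from. A local-randomization variant, as in random forests, would instead resample features inside the base learner at each split.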

Original language: English
Title of host publication: Proceedings - 19th IEEE International Conference on Data Mining, ICDM 2019
Editors: Jianyong Wang, Kyuseok Shim, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 240-249
Number of pages: 10
ISBN (Electronic): 9781728146034
DOIs
Publication status: Published - 1 Nov 2019
Event: 19th IEEE International Conference on Data Mining, ICDM 2019 - Beijing, China
Duration: 8 Nov 2019 → 11 Nov 2019

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2019-November
ISSN (Print): 1550-4786

Conference

Conference: 19th IEEE International Conference on Data Mining, ICDM 2019
Country/Territory: China
City: Beijing
Period: 8/11/19 → 11/11/19

Keywords

  • Ensemble Learning
  • Random Patches
  • Random Subspaces
  • Stream Data Mining
