SLEADE: Disagreement-Based Semi-Supervised Learning for Sparsely Labeled Evolving Data Streams

  • Heitor Murilo Gomes
  • Jesse Read
  • Maciej Grzenda
  • Bernhard Pfahringer
  • Albert Bifet

Research output: Contribution to journal › Article › peer-review

Abstract

Semi-supervised learning (SSL) problems are challenging, appear in many domains, and are particularly relevant to streaming applications, where data are abundant but labels are not. The problem tackled here is classification over an evolving data stream where labels are rare and distributed randomly. We propose SLEADE (Stream LEArning by Disagreement Ensemble), a novel method that exploits disagreement-based learning and unsupervised drift detection to leverage unlabeled data during training. SLEADE uses pseudo-labeled instances to augment the training set of each member of an ensemble using a "majority trains the minority" scheme. The impact of the pseudo-labeled data is controlled by a weighting function that considers the confidence the ensemble members attribute to the prediction. SLEADE also exploits unsupervised drift detection, which allows the ensemble to respond to changes. We present several experiments using real and synthetic data to illustrate the benefits and limitations of SLEADE compared to existing algorithms.
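
The "majority trains the minority" idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the SLEADE implementation: the function name `majority_trains_minority`, the `min_agreement` threshold, and the use of the vote share as the instance weight are all assumptions made for illustration, and the abstract does not specify the actual weighting function.

```python
from collections import Counter


def majority_trains_minority(votes, min_agreement=0.5):
    """Toy pseudo-labeling step for one unlabeled instance.

    votes: predicted labels, one per ensemble member.
    Returns (pseudo_label, weight, minority_indices).
    The weight is simply the share of members that agree with the
    majority label (an assumed stand-in for a confidence-based weight).
    """
    if not votes:
        return None, 0.0, []
    label, n_major = Counter(votes).most_common(1)[0]
    agreement = n_major / len(votes)
    # Without a strict majority above the threshold, skip pseudo-labeling.
    if agreement <= min_agreement:
        return None, 0.0, []
    # Members that disagree with the majority receive the pseudo-labeled instance.
    minority = [i for i, v in enumerate(votes) if v != label]
    return label, agreement, minority


label, weight, minority = majority_trains_minority(["spam", "spam", "ham", "spam"])
print(label, weight, minority)  # spam 0.75 [2]
```

In this sketch only member 2, which disagreed with the majority, would be trained on the instance, with label "spam" and weight 0.75.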

Original language: English
Pages (from-to): 1973-1985
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 38
Issue number: 3
Publication status: Published - 1 Jan 2026
Externally published: Yes

Keywords

  • Data stream
  • concept drift
  • disagreement-based learning
  • semi-supervised learning (SSL)
