TY - GEN
T1 - Analyzing and Repairing Concept Drift Adaptation in Data Stream Classification
AU - Halstead, Ben
AU - Koh, Yun Sing
AU - Riddle, Patricia
AU - Pears, Russel
AU - Pechenizkiy, Mykola
AU - Bifet, Albert
AU - Olivares, Gustavo
AU - Coulson, Guy
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in hidden context relevant to the classification task, e.g. weather conditions. Adaptive learning methods are able to retain performance in changing conditions by explicitly detecting concept drift and changing the classifier used to make predictions. However, in real-world conditions, existing methods often select classifiers which poorly represent current data due to adaptation errors, where change in context is misidentified. We propose the AiRStream system, which uses a novel repair algorithm to identify and correct adaptation errors. We identify errors by periodically testing the performance of inactive classifiers. If an error is identified, a backtracking procedure repairs training done under the misidentified context. AiRStream achieves higher accuracy compared to baseline methods and selects classifiers which better match changes in context. A case study on a real-world air quality inference task shows that AiRStream is able to build a robust model of environmental conditions, allowing the adaptations made to concept drift to be analysed and related to changes in weather.
AB - Data collected over time often exhibit changes in distribution, or concept drift, caused by changes in hidden context relevant to the classification task, e.g. weather conditions. Adaptive learning methods are able to retain performance in changing conditions by explicitly detecting concept drift and changing the classifier used to make predictions. However, in real-world conditions, existing methods often select classifiers which poorly represent current data due to adaptation errors, where change in context is misidentified. We propose the AiRStream system, which uses a novel repair algorithm to identify and correct adaptation errors. We identify errors by periodically testing the performance of inactive classifiers. If an error is identified, a backtracking procedure repairs training done under the misidentified context. AiRStream achieves higher accuracy compared to baseline methods and selects classifiers which better match changes in context. A case study on a real-world air quality inference task shows that AiRStream is able to build a robust model of environmental conditions, allowing the adaptations made to concept drift to be analysed and related to changes in weather.
U2 - 10.1109/DSAA53316.2021.9564191
DO - 10.1109/DSAA53316.2021.9564191
M3 - Conference contribution
AN - SCOPUS:85126106974
T3 - 2021 IEEE 8th International Conference on Data Science and Advanced Analytics, DSAA 2021
BT - 2021 IEEE 8th International Conference on Data Science and Advanced Analytics, DSAA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE International Conference on Data Science and Advanced Analytics, DSAA 2021
Y2 - 6 October 2021 through 9 October 2021
ER -