TY - GEN
T1 - Kalman Filtering for Learning with Evolving Data Streams
AU - Ziffer, Giacomo
AU - Bernardo, Alessio
AU - Valle, Emanuele Della
AU - Bifet, Albert
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - Processing data streams gained much importance in recent years. Standard machine learning algorithms do not cope well with non-stationary streaming data, where decision models evolve and generate so-called concept drift. Online adaptive algorithms emerged to solve these issues. They learn incrementally and generally require explicit forgetting mechanisms to adapt to concept drift. In this paper, we propose the application of Kalman filtering to handle evolving data streams. This novel approach addresses data stream mining and concept drift management challenges from a new perspective, directly modelling a representation suitable for the data streams. First, we study a Kalman filter based learning approach and investigate its integration into the Naïve Bayes algorithm, namely KalmanNB. Additionally, we propose the Hoeffding Kalman Tree, a combination of the Hoeffding Tree with KalmanNB. Empirical results demonstrate that the Kalman filter based approach inherently manages concept drifts, and it adapts to the emerging concept more rapidly than the state-of-the-art algorithms. Moreover, it is an accurate and robust approach and requires less storage while still being faster.
AB - Processing data streams gained much importance in recent years. Standard machine learning algorithms do not cope well with non-stationary streaming data, where decision models evolve and generate so-called concept drift. Online adaptive algorithms emerged to solve these issues. They learn incrementally and generally require explicit forgetting mechanisms to adapt to concept drift. In this paper, we propose the application of Kalman filtering to handle evolving data streams. This novel approach addresses data stream mining and concept drift management challenges from a new perspective, directly modelling a representation suitable for the data streams. First, we study a Kalman filter based learning approach and investigate its integration into the Naïve Bayes algorithm, namely KalmanNB. Additionally, we propose the Hoeffding Kalman Tree, a combination of the Hoeffding Tree with KalmanNB. Empirical results demonstrate that the Kalman filter based approach inherently manages concept drifts, and it adapts to the emerging concept more rapidly than the state-of-the-art algorithms. Moreover, it is an accurate and robust approach and requires less storage while still being faster.
KW - Concept Drift Management
KW - Hoeffding Tree
KW - Kalman Filter
KW - Naïve Bayes
KW - Online Learning
U2 - 10.1109/BigData52589.2021.9671365
DO - 10.1109/BigData52589.2021.9671365
M3 - Conference contribution
AN - SCOPUS:85125313335
T3 - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
SP - 5337
EP - 5346
BT - Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
A2 - Chen, Yixin
A2 - Ludwig, Heiko
A2 - Tu, Yicheng
A2 - Fayyad, Usama
A2 - Zhu, Xingquan
A2 - Hu, Xiaohua Tony
A2 - Byna, Suren
A2 - Liu, Xiong
A2 - Zhang, Jianping
A2 - Pan, Shirui
A2 - Papalexakis, Vagelis
A2 - Wang, Jianwu
A2 - Cuzzocrea, Alfredo
A2 - Ordonez, Carlos
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Big Data, Big Data 2021
Y2 - 15 December 2021 through 18 December 2021
ER -