TY - GEN
T1 - Fast and lightweight binary and multi-branch Hoeffding Tree Regressors
AU - Mastelini, Saulo Martiello
AU - Montiel, Jacob
AU - Gomes, Heitor Murilo
AU - Bifet, Albert
AU - Pfahringer, Bernhard
AU - De Carvalho, Andre C.P.L.F.
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - Incremental Hoeffding Tree Regressors (HTR) are powerful non-linear online learning tools. However, the commonly used strategy to build such structures limits their applicability to real-time scenarios. In this paper, we expand and evaluate Quantization Observer (QO), a feature discretization-based tool to speed up incremental regression tree construction and save memory resources. We enhance the original QO proposal to create multi-branch trees when dealing with numerical attributes, creating a mix of interval and binary splits rather than binary splits only. We evaluate the multi-branch and strictly binary QO-based HTRs against other tree-building strategies in an extensive experimental setup of 15 data streams. In general, the QO-based HTRs are as accurate as traditional HTRs, incurring one-third of training time at only a fraction of the memory resource usage. The obtained numerical multi-branch HTRs are shallower than the strictly binary ones, significantly faster to train, and they keep predictive performance similar to the traditional incremental trees.
AB - Incremental Hoeffding Tree Regressors (HTR) are powerful non-linear online learning tools. However, the commonly used strategy to build such structures limits their applicability to real-time scenarios. In this paper, we expand and evaluate Quantization Observer (QO), a feature discretization-based tool to speed up incremental regression tree construction and save memory resources. We enhance the original QO proposal to create multi-branch trees when dealing with numerical attributes, creating a mix of interval and binary splits rather than binary splits only. We evaluate the multi-branch and strictly binary QO-based HTRs against other tree-building strategies in an extensive experimental setup of 15 data streams. In general, the QO-based HTRs are as accurate as traditional HTRs, incurring one-third of training time at only a fraction of the memory resource usage. The obtained numerical multi-branch HTRs are shallower than the strictly binary ones, significantly faster to train, and they keep predictive performance similar to the traditional incremental trees.
KW - Hoeffding tree regressor
KW - computational resource savings
KW - incremental learning
KW - online learning
U2 - 10.1109/ICDMW53433.2021.00053
DO - 10.1109/ICDMW53433.2021.00053
M3 - Conference contribution
AN - SCOPUS:85125335807
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 380
EP - 388
BT - Proceedings - 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021
A2 - Xue, Bing
A2 - Pechenizkiy, Mykola
A2 - Koh, Yun Sing
PB - IEEE Computer Society
T2 - 21st IEEE International Conference on Data Mining Workshops, ICDMW 2021
Y2 - 7 December 2021 through 10 December 2021
ER -