TY - GEN
T1 - Adaptive Random Forests with Resampling for Imbalanced Data Streams
AU - Boiko Ferreira, Luis Eduardo
AU - Murilo Gomes, Heitor
AU - Bifet, Albert
AU - Oliveira, Luiz S.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7/1
Y1 - 2019/7/1
AB - The large volume of data generated by computer networks, smartphones, wearables and a wide range of sensors, which produce real-time data, is only useful if it can be processed efficiently enough for individuals to make timely decisions based on it. In this context, machine learning techniques are widely used. While such techniques perform better than humans in these tasks, every machine learning algorithm has an intrinsic bias: it assumes that the data have specific characteristics, such as a balanced distribution between classes. As many real-world applications present imbalanced data, this topic is gaining attention over time. In this work, we present the Adaptive Random Forest with Resampling (ARFRE), a classifier designed to deal with imbalanced datasets. ARFRE resamples the instances based on the current class label distribution. We show through a set of extensive experiments on seven datasets that the proposed method can considerably improve performance on the minority class(es) while avoiding degrading performance on the majority class. On top of that, ARFRE is more efficient in execution time than the standard ARF algorithm.
KW - adaptive random forest
KW - data streams
KW - ensemble
KW - imbalance
KW - resampling
U2 - 10.1109/IJCNN.2019.8852027
DO - 10.1109/IJCNN.2019.8852027
M3 - Conference contribution
AN - SCOPUS:85073235944
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2019 International Joint Conference on Neural Networks, IJCNN 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 International Joint Conference on Neural Networks, IJCNN 2019
Y2 - 14 July 2019 through 19 July 2019
ER -