TY - GEN
T1 - Data Stream Classification Using Random Feature Functions and Novel Method Combinations
AU - Read, Jesse
AU - Bifet, Albert
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/12/2
Y1 - 2015/12/2
N2 - Data streams are being generated faster, in larger volumes, and in more commonplace settings. In this scenario, Hoeffding trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. k-nearest neighbours is also a popular choice, with most extensions dealing with its inherent performance limitations over a potentially infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing to the proliferation of interest and successes in deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyperparameter options and initial conditions to be considered an effective 'off-the-shelf' data streams solution. In this work, we look at combinations of Hoeffding trees, nearest neighbour, and gradient descent methods with a streaming preprocessing approach in the form of a random feature functions filter for additional predictive power. Our empirical evaluation yields positive results for the novel approaches that we experiment with, highlights important issues, and sheds light on promising future directions in approaches to data stream classification.
AB - Data streams are being generated faster, in larger volumes, and in more commonplace settings. In this scenario, Hoeffding trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. k-nearest neighbours is also a popular choice, with most extensions dealing with its inherent performance limitations over a potentially infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing to the proliferation of interest and successes in deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyperparameter options and initial conditions to be considered an effective 'off-the-shelf' data streams solution. In this work, we look at combinations of Hoeffding trees, nearest neighbour, and gradient descent methods with a streaming preprocessing approach in the form of a random feature functions filter for additional predictive power. Our empirical evaluation yields positive results for the novel approaches that we experiment with, highlights important issues, and sheds light on promising future directions in approaches to data stream classification.
U2 - 10.1109/Trustcom.2015.585
DO - 10.1109/Trustcom.2015.585
M3 - Conference contribution
AN - SCOPUS:84969174857
T3 - Proceedings - 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2015
SP - 211
EP - 216
BT - Proceedings - 9th IEEE International Conference on Big Data Science and Engineering, BigDataSE 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 14th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, TrustCom 2015
Y2 - 20 August 2015 through 22 August 2015
ER -