TY - GEN
T1 - Big data stream learning with SAMOA
AU - Bifet, Albert
AU - De Francisci Morales, Gianmarco
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2015/1/26
Y1 - 2015/1/26
AB - Big data is flowing into every area of our lives, professional and personal. Big data is defined as datasets whose size is beyond the ability of typical software tools to capture, store, manage, and analyze because of the time and memory complexity involved. Velocity is one of the main properties of big data. In this demo, we present SAMOA (Scalable Advanced Massive Online Analysis), an open-source platform for mining big data streams. It provides a collection of distributed streaming algorithms for the most common data mining and machine learning tasks such as classification, clustering, and regression, as well as programming abstractions to develop new algorithms. It features a pluggable architecture that allows it to run on several distributed stream processing engines such as Storm, S4, and Samza. SAMOA is written in Java and is available at http://samoa-project.net under the Apache Software License version 2.0.
KW - Classification
KW - Clustering
KW - Data Streams
KW - Distributed Systems
KW - Machine Learning
KW - Regression
KW - Toolbox
UR - https://www.scopus.com/pages/publications/84936889401
U2 - 10.1109/ICDMW.2014.24
DO - 10.1109/ICDMW.2014.24
M3 - Conference contribution
AN - SCOPUS:84936889401
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 1199
EP - 1202
BT - Proceedings - 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
A2 - Zhou, Zhi-Hua
A2 - Wang, Wei
A2 - Kumar, Ravi
A2 - Toivonen, Hannu
A2 - Pei, Jian
A2 - Huang, Joshua Zhexue
A2 - Wu, Xindong
PB - IEEE Computer Society
T2 - 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Y2 - 14 December 2014
ER -