TY - GEN
T1 - Mining frequent closed graphs on evolving data streams
AU - Bifet, Albert
AU - Holmes, Geoff
AU - Pfahringer, Bernhard
AU - Gavaldà, Ricard
PY - 2011/1/1
Y1 - 2011/1/1
N2 - Graph mining is a challenging task by itself, and even more so when processing data streams which evolve in real time. Data stream mining faces hard constraints regarding time and space for processing, and also needs to provide for concept drift detection. In this paper we present a framework for studying graph pattern mining on time-varying streams. Three new methods for mining frequent closed subgraphs are presented. All methods work on coresets of closed subgraphs, compressed representations of graph sets, and maintain these sets in a batch-incremental manner, but use different approaches to address potential concept drift. An evaluation study on datasets comprising up to four million graphs explores the strengths and limitations of the proposed methods. To the best of our knowledge this is the first work on mining frequent closed subgraphs in non-stationary data streams.
KW - Closed mining
KW - Concept drift
KW - Data streams
KW - Graphs
UR - https://www.scopus.com/pages/publications/80052650039
U2 - 10.1145/2020408.2020501
DO - 10.1145/2020408.2020501
M3 - Conference contribution
AN - SCOPUS:80052650039
SN - 9781450308137
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 591
EP - 599
BT - Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD'11
PB - Association for Computing Machinery
T2 - 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2011
Y2 - 21 August 2011 through 24 August 2011
ER -