TY - GEN
T1 - A sat-based approach for mining high utility itemsets from transaction databases
AU - Hidouri, Amel
AU - Jabbour, Said
AU - Raddaoui, Badran
AU - Yaghlane, Boutheina Ben
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2020.
PY - 2020/1/1
Y1 - 2020/1/1
N2 - Mining high utility itemsets is a keystone in several data analysis tasks. High Utility Itemset Mining generalizes the frequent itemset mining problem by considering item quantities and weights. A high utility itemset is a set of items that appears in the transadatabase and having a high importance to the user, measured by a utility function. The utility of a pattern can be quantified in terms of various objective criteria, e.g., profit, frequency, and weight. Constraint Programming (CP) and Propositional Satisfiability (SAT) based frameworks for modeling and solving pattern mining tasks have gained a considerable attention in recent few years. This paper introduces the first declarative framework for mining high utility itemsets from transaction databases. First, we model the problem of mining high utility itemsets from transaction databases as a propositional satifiability problem. Moreover, to facilitate the mining task, we add an additional constraint to the efficiency of our method by using weighted clique cover problem. Then, we exploit the efficient SAT solving techniques to output all the high utility itemsets in the data that satisfy a user-specified minimum support and minimum utility values. Experimental evaluations on real and synthetic datasets show that the performance of our proposed approach is close to that of the optimal case of state-of-the-art HUIM algorithms.
AB - Mining high utility itemsets is a keystone in several data analysis tasks. High Utility Itemset Mining generalizes the frequent itemset mining problem by considering item quantities and weights. A high utility itemset is a set of items that appears in the transadatabase and having a high importance to the user, measured by a utility function. The utility of a pattern can be quantified in terms of various objective criteria, e.g., profit, frequency, and weight. Constraint Programming (CP) and Propositional Satisfiability (SAT) based frameworks for modeling and solving pattern mining tasks have gained a considerable attention in recent few years. This paper introduces the first declarative framework for mining high utility itemsets from transaction databases. First, we model the problem of mining high utility itemsets from transaction databases as a propositional satifiability problem. Moreover, to facilitate the mining task, we add an additional constraint to the efficiency of our method by using weighted clique cover problem. Then, we exploit the efficient SAT solving techniques to output all the high utility itemsets in the data that satisfy a user-specified minimum support and minimum utility values. Experimental evaluations on real and synthetic datasets show that the performance of our proposed approach is close to that of the optimal case of state-of-the-art HUIM algorithms.
KW - Constraint programming
KW - Data mining
KW - High utility
KW - Maximal clique
KW - Propositional satisfiabilty
UR - https://www.scopus.com/pages/publications/85091528531
U2 - 10.1007/978-3-030-59065-9_8
DO - 10.1007/978-3-030-59065-9_8
M3 - Conference contribution
AN - SCOPUS:85091528531
SN - 9783030590642
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 91
EP - 106
BT - Big Data Analytics and Knowledge Discovery - 22nd International Conference, DaWaK 2020, Proceedings
A2 - Song, Min
A2 - Song, Il-Yeol
A2 - Kotsis, Gabriele
A2 - Khalil, Ismail
A2 - Tjoa, A Min
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd International Conference on Big Data Analytics and Knowledge Discovery, DaWaK 2020
Y2 - 14 September 2020 through 17 September 2020
ER -