Passer à la navigation principale Passer à la recherche Passer au contenu principal

On the Complexity of Mining Itemsets from the Crowd Using Taxonomies

  • Tel Aviv University

Résultats de recherche: Contribution à une conférencePapierRevue par des pairs

Résumé

We study the problem of frequent itemset mining in domains where data is not recorded in a conventional database but only exists in human knowledge. We provide examples of such scenarios, and present a crowdsourcing model for them. The model uses the crowd as an oracle to find out whether an itemset is frequent or not, and relies on a known taxonomy of the item domain to guide the search for frequent itemsets. In the spirit of data mining with oracles, we analyze the complexity of this problem in terms of (i) crowd complexity, that measures the number of crowd questions required to identify the frequent itemsets; and (ii) computational complexity, that measures the computational effort required to choose the questions. We provide lower and upper complexity bounds in terms of the size and structure of the input taxonomy, as well as the size of a concise description of the output itemsets. We also provide constructive algorithms that achieve the upper bounds, and consider more efficient variants for practical situations.

langue originaleAnglais
Pages15-25
Nombre de pages11
Les DOIs
étatPublié - 1 janv. 2014
Evénement17th International Conference on Database Theory, ICDT 2014 - Athens, Grcce
Durée: 24 mars 201428 mars 2014

Une conférence

Une conférence17th International Conference on Database Theory, ICDT 2014
Pays/TerritoireGrcce
La villeAthens
période24/03/1428/03/14

Empreinte digitale

Examiner les sujets de recherche de « On the Complexity of Mining Itemsets from the Crowd Using Taxonomies ». Ensemble, ils forment une empreinte digitale unique.

Contient cette citation