Tuple reconstruction

  • Ngurah Agus Sanjaya Er
  • Mouhamadou Lamine Ba
  • Talel Abdessalem
  • Stéphane Bressan
  • CNRS
  • Université Alioune Diop de Bambey
  • National University of Singapore

Research output: Chapter in a book, report, anthology, or collection; Conference contribution; Peer-reviewed

Abstract

The set of tuples expansion system (STEP) extracts information from the Web in the form of tuples. It builds a graph of entities whose nodes are Web pages, wrappers, seeds, domains, and candidates, and whose edges are the relationships between them. The final weight assigned to each node after running random walks on the graph is used to rank the extracted candidates. Owing to the nature of the regular expressions used as wrappers, some of the extracted candidates may contain “noise” and can therefore be considered “false”. These false candidates may rank higher than the “true” ones because they are extracted from many Web pages or produced by many different wrappers. Minimizing these false candidates is necessary to ensure the validity of the results. In this research, we propose a method that tackles this problem in STEP by reconstructing tuples. We begin by extracting binary tuples from the Web. Each binary tuple consists of a key attribute and a property of that attribute. To validate the truthfulness of the binary tuples, we apply truth-finding algorithms, which helps us build a credible list of binary tuples. We propose two methods to reconstruct tuples from binary ones. We use the reconstructed tuples to enrich the graph of entities of STEP so that the “true” candidates receive more confidence and rank higher in the graph. We show that our approach is efficient and significantly improves the confidence level of the tuples extracted by STEP. We also conduct an experiment on a real-world case of populating a database relation from the Web with our proposed approach.
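
The abstract does not spell out which truth-finding algorithms are applied or how the two reconstruction methods work, so the following Python sketch is only an illustration under stated assumptions. Majority voting stands in for a truth-finding algorithm, and reconstruction is a plain join of the trusted binary tuples on their key attribute. The function names, source labels, and sample data are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def vote_binary_tuples(observations):
    """Keep, for each (key, attribute) pair, the value claimed by most sources.

    `observations` is an iterable of (source, key, attribute, value).
    Assumption: majority voting is used here as a simple stand-in for a
    truth-finding algorithm; it is not the method of the paper.
    """
    votes = defaultdict(Counter)
    seen = set()
    for source, key, attribute, value in observations:
        claim = (source, key, attribute, value)
        if claim in seen:  # count each source at most once per claim
            continue
        seen.add(claim)
        votes[(key, attribute)][value] += 1
    return {pair: counts.most_common(1)[0][0] for pair, counts in votes.items()}


def reconstruct_tuples(trusted, attributes):
    """Join trusted binary tuples on their key into n-ary tuples.

    Assumption: a key is kept only if every requested attribute has a
    trusted value; keys with missing attributes are skipped.
    """
    by_key = defaultdict(dict)
    for (key, attribute), value in trusted.items():
        by_key[key][attribute] = value
    return sorted(
        (key, *(values[a] for a in attributes))
        for key, values in by_key.items()
        if all(a in values for a in attributes)
    )


# Hypothetical observations: (source page, key, attribute, value).
observations = [
    ("page1", "France", "capital", "Paris"),
    ("page2", "France", "capital", "Paris"),
    ("page3", "France", "capital", "Lyon"),
    ("page1", "France", "currency", "Euro"),
    ("page2", "France", "currency", "Euro"),
    ("page1", "Japan", "capital", "Tokyo"),
]

trusted = vote_binary_tuples(observations)
tuples = reconstruct_tuples(trusted, ["capital", "currency"])
print(tuples)
```

In this toy run the noisy value "Lyon" is outvoted, and "Japan" is dropped from the reconstructed relation because no currency was observed for it. A real system would weight sources by estimated reliability instead of counting them equally.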

Original language: English
Title: Database Systems for Advanced Applications - DASFAA 2018 International Workshops
Subtitle: BDMS, BDQM, GDMA, and SeCoP, Proceedings
Editors: Jianxin Li, Lei Zou, Chengfei Liu
Publisher: Springer Verlag
Pages: 239-254
Number of pages: 16
ISBN (print): 9783319914541
DOIs
State: Published - 1 Jan 2018
Event: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018 - Gold Coast, Australia
Duration: 21 May 2018 to 24 May 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10829 LNCS
ISSN (print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018
Country/Territory: Australia
City: Gold Coast
Period: 21/05/18 to 24/05/18
