Tuple reconstruction

  • Ngurah Agus Sanjaya Er
  • Mouhamadou Lamine Ba
  • Talel Abdessalem
  • Stéphane Bressan
  • Centre national de la recherche scientifique
  • Université Alioune Diop de Bambey
  • National University of Singapore

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

The Set of Tuples Expansion system (STEP) extracts information from the Web in the form of tuples. It builds a graph of entities whose nodes are Web pages, wrappers, seeds, domains, and candidates, and whose edges are the relationships between them. The final weight assigned to each node after running random walks on the graph is used to rank the extracted candidates. Because the wrappers are regular expressions, some of the extracted candidates may contain “noise” and can therefore be considered “false”. These false candidates may rank higher than the “true” ones because they are extracted from many Web pages or produced by many different wrappers. Minimizing these false candidates is necessary to ensure the validity of the results. In this research, we propose a method that tackles this problem of STEP by reconstructing tuples. We begin by extracting binary tuples from the Web. Each binary tuple consists of a key attribute and a property of that attribute. To validate the truthfulness of the binary tuples, we apply truth-finding algorithms, which helps us build a credible list of binary tuples. We propose two methods to reconstruct tuples from binary ones. We use the reconstructed tuples to enrich STEP's graph of entities so that the “true” candidates receive more confidence and rank higher in the graph. We show that our approach is efficient and significantly improves the confidence level of the tuples extracted by STEP. We also conduct an experiment on a real-world case of populating a database relation from the Web with our proposed approach.
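
The abstract does not say which truth-finding algorithms or which two reconstruction methods are used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes majority voting as a stand-in truth-finding baseline and a plain join on the key attribute as the reconstruction step. The function names (`vote`, `reconstruct`), the claim format, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


def vote(claims):
    """Majority voting, used here as a simple truth-finding baseline.

    Each claim is (source, key, attribute, value). For every
    (key, attribute) pair, keep the value asserted by the most sources.
    """
    tally = defaultdict(Counter)
    # dict.fromkeys drops exact duplicate claims while keeping their order.
    for _source, key, attribute, value in dict.fromkeys(claims):
        tally[(key, attribute)][value] += 1
    return {pair: counts.most_common(1)[0][0] for pair, counts in tally.items()}


def reconstruct(truths, attributes):
    """Join validated binary tuples that share a key into n-ary tuples.

    Keys that lack a value for any requested attribute are skipped.
    """
    by_key = defaultdict(dict)
    for (key, attribute), value in truths.items():
        by_key[key][attribute] = value
    return sorted(
        (key, *(props[a] for a in attributes))
        for key, props in by_key.items()
        if all(a in props for a in attributes)
    )


# Hypothetical binary tuples extracted from three Web pages.
claims = [
    ("page1", "France", "capital", "Paris"),
    ("page2", "France", "capital", "Paris"),
    ("page3", "France", "capital", "Lyon"),   # noisy candidate
    ("page1", "France", "currency", "Euro"),
    ("page2", "Japan", "capital", "Tokyo"),
    ("page3", "Japan", "currency", "Yen"),
    ("page1", "Peru", "capital", "Lima"),     # no currency claim
]

truths = vote(claims)
tuples = reconstruct(truths, ["capital", "currency"])
print(tuples)
```

In this toy run the noisy value "Lyon" loses the vote to "Paris", and "Peru" is left out of the result because no currency was extracted for it. How the reconstructed tuples are fed back into STEP's graph of entities is not covered by this sketch.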

Original language: English
Title of host publication: Database Systems for Advanced Applications - DASFAA 2018 International Workshops
Subtitle of host publication: BDMS, BDQM, GDMA, and SeCoP, Proceedings
Editors: Jianxin Li, Lei Zou, Chengfei Liu
Publisher: Springer Verlag
Pages: 239-254
Number of pages: 16
ISBN (Print): 9783319914541
Publication status: Published - 1 Jan 2018
Event: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018 - Gold Coast, Australia
Duration: 21 May 2018 to 24 May 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10829 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018
Country/Territory: Australia
City: Gold Coast
Period: 21/05/18 to 24/05/18

Keywords

  • Reconstruction
  • Set expansion
  • Truth-finding
  • Tuples
