WebChild: Harvesting and organizing commonsense knowledge from the web

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a method for automatically constructing a large commonsense knowledge base, called WebChild, from Web content. WebChild contains triples that connect nouns with adjectives via fine-grained relations such as hasShape, hasTaste, and evokesEmotion. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses. Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses. Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained, disambiguated assertions) of WebChild.
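As an illustration of the general idea behind label propagation with seeds, the following is a minimal sketch of a textbook variant (iterative neighbor averaging with clamped seed nodes). It is not the authors' formulation or code: the function name `propagate_labels`, the toy graph, the edge weights, the `word#sense` node names, and the two labels are all hypothetical and chosen only to show how a few seed labels can spread over a graph of noisy candidates.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=50):
    """Textbook label propagation on an undirected weighted graph.

    edges: list of (node_u, node_v, weight) tuples
    seeds: dict mapping a node to its fixed {label: score} distribution
    Returns a dict mapping every node to a {label: score} dict.
    """
    # Build a symmetric adjacency list.
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))

    labels = sorted({label for dist in seeds.values() for label in dist})

    # Unlabeled nodes start at zero for every label; seeds keep their scores.
    scores = {node: dict.fromkeys(labels, 0.0) for node in neighbors}
    for node, dist in seeds.items():
        scores.setdefault(node, dict.fromkeys(labels, 0.0)).update(dist)

    for _ in range(iterations):
        updated = {}
        for node in scores:
            total = sum(w for _, w in neighbors[node])
            if node in seeds or total == 0:
                # Seeds are clamped; isolated nodes have nothing to average.
                updated[node] = dict(scores[node])
                continue
            # Each unlabeled node takes the weighted average of its neighbors.
            updated[node] = {
                label: sum(w * scores[m][label] for m, w in neighbors[node]) / total
                for label in labels
            }
        scores = updated
    return scores


# Hypothetical toy graph: candidate noun senses for the domain of hasTaste.
edges = [
    ("apple#fruit", "pie#dish", 1.0),
    ("pie#dish", "cake#dish", 1.0),
    ("cake#dish", "apple#company", 0.1),  # weak, noisy link
    ("apple#company", "laptop#device", 1.0),
]
seeds = {
    "apple#fruit": {"in_domain": 1.0, "out_of_domain": 0.0},
    "laptop#device": {"in_domain": 0.0, "out_of_domain": 1.0},
}

scores = propagate_labels(edges, seeds)
for node in sorted(scores):
    print(node, scores[node])
```

In this toy setting the strongly connected food senses end up closer to the `in_domain` seed, while `apple#company` is pulled toward `out_of_domain` because its only strong edge leads to the negative seed. The resulting scores can be read as a confidence ranking, which is the role the abstract assigns to the propagated labels.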

Original language: English
Title of host publication: WSDM 2014 - Proceedings of the 7th ACM International Conference on Web Search and Data Mining
Publisher: Association for Computing Machinery
Pages: 523-532
Number of pages: 10
ISBN (Print): 9781450323512
Publication status: Published - 1 Jan 2014
Event: 7th ACM International Conference on Web Search and Data Mining, WSDM 2014 - New York, NY, United States
Duration: 24 Feb 2014 to 28 Feb 2014

Publication series

Name: WSDM 2014 - Proceedings of the 7th ACM International Conference on Web Search and Data Mining

Conference

Conference: 7th ACM International Conference on Web Search and Data Mining, WSDM 2014
Country/Territory: United States
City: New York, NY
Period: 24/02/14 to 28/02/14

Keywords

  • commonsense knowledge
  • knowledge bases
  • label propagation
  • web mining
  • word sense disambiguation
