
Named entity recognition and identification for finding the owner of a home page

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Entity-based applications, such as expert search or online social networks where users search for persons, require high-quality datasets of named entity references. Obtaining such high-quality datasets can be achieved by automatically extracting metadata from Web pages. In this work, we focus on the identification of the named entity that corresponds to the owner of a particular Web page, for example, a home page or an organizational staff Web page. More specifically, from a set of named entities that have already been extracted from a Web page, we identify the one which corresponds to the owner of the home page. First, we develop a set of features which are combined in a scoring function to select the named entity of the Web page owner. Second, we formulate the problem as a classification problem in which a pair of a Web page and named entity is classified as being associated or not. We evaluate the proposed approaches on a set of Web pages in which we have previously identified named entities. Our experimental results show that we can identify the named entity corresponding to the owner of a home page with accuracy over 90%.
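The first approach in the abstract, a scoring function over per-entity features, can be illustrated with a minimal sketch. The abstract does not list the actual features or weights, so everything named here (`Candidate`, `features`, `WEIGHTS`, `select_owner`, and the four features themselves) is a hypothetical stand-in chosen for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """A named entity already extracted from a page (hypothetical structure)."""
    name: str
    count: int         # number of occurrences of the name on the page
    first_offset: int  # character offset of the first occurrence


def features(c: Candidate, title: str, url: str, page_len: int) -> Dict[str, float]:
    """Compute illustrative features in [0, 1]; these are assumptions, not the paper's."""
    tokens = c.name.lower().split()
    return {
        "in_title": float(c.name.lower() in title.lower()),
        "in_url": float(any(t in url.lower() for t in tokens)),
        "frequency": min(c.count / 10.0, 1.0),
        "early": 1.0 - c.first_offset / max(page_len, 1),
    }


# Hand-picked weights for illustration only; they are not taken from the paper.
WEIGHTS = {"in_title": 0.4, "in_url": 0.3, "frequency": 0.2, "early": 0.1}


def score(c: Candidate, title: str, url: str, page_len: int) -> float:
    """Combine the features linearly into a single score."""
    f = features(c, title, url, page_len)
    return sum(WEIGHTS[k] * f[k] for k in WEIGHTS)


def select_owner(cands: List[Candidate], title: str, url: str, page_len: int) -> Candidate:
    """Return the highest-scoring candidate as the presumed page owner."""
    return max(cands, key=lambda c: score(c, title, url, page_len))


if __name__ == "__main__":
    cands = [Candidate("Jane Doe", 5, 10), Candidate("Acme University", 8, 500)]
    owner = select_owner(cands, "Jane Doe - Home Page", "http://example.org/~jdoe/", 1000)
    print(owner.name)
```

For the second approach, the same kind of feature vector could be computed for each (Web page, named entity) pair and passed to a binary classifier that labels the pair as associated or not; which classifier the authors used is not stated in the abstract.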

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 16th Pacific-Asia Conference, PAKDD 2012, Proceedings
Pages: 554-565
Number of pages: 12
Edition: PART 1
Publication status: Published - 29 May 2012
Event: 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2012 - Kuala Lumpur, Malaysia
Duration: 29 May 2012 → 1 Jun 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7301 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2012
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 29/05/12 → 1/06/12

Keywords

  • entity selection
  • named entity recognition
