TY - GEN
T1 - YAWN
T2 - Datenbanksysteme in Business, Technologie und Web, BTW 2007, 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2007 - Database Systems for Business, Technology and Web, BTW 2007, 12th Conference of the GI Division "Databases and Information Systems", DBIS 2007
AU - Schenkel, Ralf
AU - Suchanek, Fabian
AU - Kasneci, Gjergji
N1 - Publisher Copyright:
© 2007 Gesellschaft für Informatik (GI). All rights reserved.
PY - 2007/1/1
Y1 - 2007/1/1
AB - The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extracts additional information from lists, and utilizes the invocations of templates with named parameters. We give examples of how such annotations can be exploited for high-precision queries.
UR - https://www.scopus.com/pages/publications/85135869663
M3 - Conference contribution
AN - SCOPUS:85135869663
T3 - Lecture Notes in Informatics (LNI), Proceedings - Series of the Gesellschaft für Informatik (GI)
SP - 277
EP - 291
BT - Datenbanksysteme in Business, Technologie und Web, BTW 2007, 12. Fachtagung des GI-Fachbereichs "Datenbanken und Informationssysteme", DBIS 2007, Proceedings
A2 - Kemper, Alfons
A2 - Schöning, Harald
A2 - Rose, Thomas
A2 - Jarke, Matthias
A2 - Seidl, Thomas
A2 - Brochhaus, Christoph
PB - Gesellschaft für Informatik (GI)
Y2 - 7 March 2007 through 9 March 2007
ER -