Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model in which each document is represented by a graph that encodes relationships between the different terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes by enriching graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria. Code and data are available online.
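
The Graph-of-Words idea in the abstract can be illustrated with a minimal sketch. It assumes an undirected co-occurrence graph built with a sliding window and uses degree centrality as the node weight. The window size, the choice of degree centrality, and the function names `build_gow` and `degree_centrality` are illustrative assumptions, not the authors' released implementation, and the collection graphs, label graphs and embedding-based enrichment are not shown.

```python
from collections import defaultdict


def build_gow(tokens, window=3):
    """Build an undirected graph-of-words as an adjacency map.

    Two terms are linked when they co-occur within `window` consecutive
    tokens. The window size is an assumption made for illustration.
    """
    adj = defaultdict(set)
    for i, term in enumerate(tokens):
        adj[term]  # touch the key so isolated terms still become nodes
        for other in tokens[i + 1 : i + window]:
            if other != term:  # skip self-loops
                adj[term].add(other)
                adj[other].add(term)
    return adj


def degree_centrality(adj):
    """Weight each term by its normalized degree (one possible centrality)."""
    n = len(adj)
    if n <= 1:
        return {term: 0.0 for term in adj}
    return {term: len(neighbors) / (n - 1) for term, neighbors in adj.items()}


tokens = "graph of words model encodes relationships between words".split()
weights = degree_centrality(build_gow(tokens))

# Terms ranked from most to least central in this toy document.
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.2f}")
```

In this toy document, "words" is linked to every other term and therefore receives the highest weight, whereas a raw frequency count would only record that it occurs twice.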

Original language: English
Title of host publication: NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop
Editors: Goran Glavas, Swapna Somasundaran, Martin Riedl, Eduard Hovy
Publisher: Association for Computational Linguistics
Pages: 49-58
Number of pages: 10
ISBN (Electronic): 9781948087254
Publication status: Published - 1 Jan 2018
Externally published: Yes
Event: 12th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human, NAACL HLT 2018 - New Orleans, United States
Duration: 6 Jun 2018 → …

Publication series

Name: NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop

Conference

Conference: 12th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human, NAACL HLT 2018
Country/Territory: United States
City: New Orleans
Period: 6/06/18 → …
