TY - GEN
T1 - Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification
AU - Skianis, Konstantinos
AU - Malliaros, Fragkiskos D.
AU - Vazirgiannis, Michalis
N1 - Publisher Copyright: © 2018 Association for Computational Linguistics.
PY - 2018/1/1
Y1 - 2018/1/1
AB - Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model in which each document is represented by a graph that encodes relationships between the different terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes by enriching graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria. Code and data are available online.
M3 - Conference contribution
AN - SCOPUS:85096207330
T3 - NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop
SP - 49
EP - 58
BT - NAACL HLT 2018 - Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - Proceedings of the 12th Workshop
A2 - Glavas, Goran
A2 - Somasundaran, Swapna
A2 - Riedl, Martin
A2 - Hovy, Eduard
PB - Association for Computational Linguistics
T2 - 12th Workshop on Graph-Based Methods for Natural Language Processing, TextGraphs 2018 - in conjunction with the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2018
Y2 - 6 June 2018
ER -