Boosting Tricks for Word Mover’s Distance

Konstantinos Skianis, Fragkiskos D. Malliaros, Nikolaos Tziortziotis, Michalis Vazirgiannis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Word embeddings have opened a new path in creating novel approaches for addressing traditional problems in the natural language processing (NLP) domain. However, using word embeddings to compare text documents remains a relatively unexplored topic, with Word Mover’s Distance (WMD) being the prominent tool used so far. In this paper, we present a variety of tools that can further improve the computation of distances between documents based on WMD. We demonstrate that alternative stopwords, cross document-topic comparison, deep contextualized word vectors, and convex metric learning constitute powerful tools that can boost WMD.

Original language: English
Title of host publication: Artificial Neural Networks and Machine Learning – ICANN 2020 - 29th International Conference on Artificial Neural Networks, Proceedings
Editors: Igor Farkaš, Paolo Masulli, Stefan Wermter
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 761-772
Number of pages: 12
ISBN (Print): 9783030616151
Publication status: Published - 1 Jan 2020
Event: 29th International Conference on Artificial Neural Networks, ICANN 2020 - Bratislava, Slovakia
Duration: 15 Sept 2020 – 18 Sept 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12397 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Conference on Artificial Neural Networks, ICANN 2020
Country/Territory: Slovakia
City: Bratislava
Period: 15/09/20 – 18/09/20

Keywords

  • Text classification
  • Word embeddings
  • Word mover’s distance
