
Word sense induction with agglomerative clustering and mutual information maximization

Research output: Contribution to journal › Article › peer-review

Abstract

Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word's senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public.
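To illustrate the clustering side of the pipeline described above, the sketch below runs average-linkage agglomerative clustering over toy 2-D vectors standing in for the contextual embeddings of a target word. This is a minimal, self-contained illustration of the general technique, not the authors' implementation: the toy vectors, the `agglomerative_cluster` function, and the fixed number of clusters are all assumptions for demonstration.

```python
import numpy as np

def agglomerative_cluster(vectors, n_clusters):
    """Minimal average-linkage agglomerative (hierarchical) clustering.

    Illustrative sketch: in the WSI setting, `vectors` would be contextual
    embeddings of occurrences of one target word, and each final cluster
    would correspond to one induced sense.
    """
    X = np.asarray(vectors, dtype=float)
    clusters = [[i] for i in range(len(X))]  # start: one cluster per point

    def avg_dist(a, b):
        # average pairwise Euclidean distance between two clusters
        return np.mean([np.linalg.norm(X[i] - X[j]) for i in a for j in b])

    # Greedily merge the two closest clusters until n_clusters remain
    # (the "fixed number of clusters" configuration from the abstract).
    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = avg_dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]

    labels = [0] * len(X)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels

# Toy example: two well-separated "sense" groups of occurrence embeddings.
emb = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]]
labels = agglomerative_cluster(emb, n_clusters=2)
```

In practice one would cluster the higher-quality representations produced by the IIC-trained model, and the dynamic-cluster configuration would replace the fixed `n_clusters` with a distance-threshold stopping criterion.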

Original language: English
Pages (from-to): 193-201
Number of pages: 9
Journal: AI Open
Volume: 4
DOIs
Publication status: Published - 1 Jan 2023
Externally published: Yes

Keywords

  • BERT
  • Clustering
  • Mutual information
  • Natural language processing
  • Transformer
  • Unsupervised machine learning
  • Word sense induction
