Hard and fuzzy diagonal co-clustering for document-term partitioning

Charlotte Laclau, Mohamed Nadif

Research output: Contribution to journal › Article › peer-review

Abstract

We propose hard and fuzzy diagonal co-clustering algorithms, built upon the double K-means, to address the problem of document-term co-clustering. At each iteration, the proposed algorithms seek a diagonal block structure of the data by minimizing a criterion based on both the within-class variance and the centroid effect. In addition to being easy to interpret and effective on sparse binary and continuous data, the proposed algorithms, Hard Diagonal Double K-means (DDKM) and Fuzzy Diagonal Double K-means (F-DDKM), are also faster than other state-of-the-art clustering algorithms. We evaluate our contribution on synthetic data sets and on real data sets commonly used in document clustering.
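
To make the idea of a diagonal block structure concrete, the following is a minimal Python sketch of a generic hard diagonal co-clustering loop. It is not the DDKM criterion of the paper, which the abstract does not spell out. It assumes a simple squared-error objective with one centroid per diagonal block and a single shared centroid for all off-diagonal cells. The names `diagonal_coclustering`, `criterion`, `mu` and `mu0` are hypothetical and chosen for illustration only.

```python
import numpy as np


def criterion(X, z, w, mu, mu0):
    """Squared error between X and its diagonal block model (an assumed objective)."""
    on_diag = z[:, None] == w[None, :]            # cell (i, j) lies in a diagonal block
    model = np.where(on_diag, mu[z][:, None], mu0)
    return float(((X - model) ** 2).sum())


def diagonal_coclustering(X, g, n_iter=20, seed=0):
    """Toy hard diagonal co-clustering: rows and columns share the same g labels."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    z = rng.integers(g, size=n)                   # row (document) labels
    w = rng.integers(g, size=d)                   # column (term) labels
    mu = np.ones(g)                               # one centroid per diagonal block
    mu0 = 0.0                                     # shared centroid for off-diagonal cells
    labels = np.arange(g)[:, None]
    history = []

    for _ in range(n_iter):
        # Centroid step: block means minimize the squared error for fixed labels.
        on_diag = z[:, None] == w[None, :]
        for k in range(g):
            block = X[np.ix_(z == k, w == k)]
            if block.size:                        # keep the old value for empty blocks
                mu[k] = block.mean()
        if (~on_diag).any():
            mu0 = X[~on_diag].mean()
        history.append(criterion(X, z, w, mu, mu0))

        # Row step: cost of assigning each row to each cluster, columns fixed.
        col_model = np.where(w[None, :] == labels, mu[:, None], mu0)        # g x d
        row_cost = ((X[:, None, :] - col_model[None, :, :]) ** 2).sum(axis=2)
        z_new = row_cost.argmin(axis=1)

        # Column step: same computation on the transpose, using the new row labels.
        row_model = np.where(z_new[None, :] == labels, mu[:, None], mu0)    # g x n
        col_cost = ((X.T[:, None, :] - row_model[None, :, :]) ** 2).sum(axis=2)
        w_new = col_cost.argmin(axis=1)

        if np.array_equal(z_new, z) and np.array_equal(w_new, w):
            break                                 # labels are stable
        z, w = z_new, w_new

    return z, w, history
```

Each of the three steps can only lower or preserve the assumed objective, so the values in `history` are non-increasing. Like any K-means-style procedure, the loop may stop in a local optimum that depends on the random initialization.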

Original language: English
Pages (from-to): 133-147
Number of pages: 15
Journal: Neurocomputing
Volume: 193
Publication status: Published - 12 Jun 2016
Externally published: Yes

Keywords

  • Co-clustering
  • Document clustering
  • Fuzzy co-clustering
