Automated calibration of consensus weighted distance-based clustering approaches using sharp

  • Barbara Bodinier
  • Dragana Vuckovic
  • Sabrina Rodrigues
  • Sarah Filippi
  • Julien Chiquet
  • Marc Chadeau-Hyam

Research output: Contribution to journal › Article › peer-review

Abstract

Motivation: In consensus clustering, a clustering algorithm is used in combination with a subsampling procedure to detect stable clusters. Previous studies on both simulated and real data suggest that consensus clustering outperforms native algorithms.

Results: We extend consensus clustering to allow for attribute weighting in the calculation of pairwise distances using existing regularized approaches. We propose a procedure for calibrating the number of clusters (and the regularization parameter) by maximizing the sharp score, a novel stability score calculated directly from consensus clustering outputs, which makes it extremely computationally competitive. Our simulation study shows better clustering performance of (i) approaches calibrated by maximizing the sharp score compared to existing calibration scores and (ii) weighted compared to unweighted approaches in the presence of features that do not contribute to cluster definition. Application to real gene expression data measured in lung tissue reveals clear clusters corresponding to different lung cancer subtypes.
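The subsampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and it does not compute the sharp score, whose definition is not given on this page. It only builds a co-membership (consensus) matrix: the proportion of subsamples in which two items, when both drawn, fall in the same cluster. The function names `kmeans` and `consensus_matrix`, the use of k-means as the base algorithm, the 50% subsample fraction, and the toy data are all assumptions made for illustration.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means (assumed base algorithm); returns one label per point."""
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def consensus_matrix(points, k, n_subsamples=50, fraction=0.5, seed=0):
    """Co-membership proportions over subsamples (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    size = max(k, int(fraction * n))
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), size)
        labels = kmeans([points[i] for i in idx], k, rng)
        for (a, la), (b, lb) in combinations(zip(idx, labels), 2):
            sampled[a][b] += 1
            sampled[b][a] += 1
            if la == lb:
                together[a][b] += 1
                together[b][a] += 1
    # Diagonal is set to 1; pairs never drawn together default to 0.
    return [
        [
            1.0 if i == j else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy data (assumed): two well-separated Gaussian blobs of 10 points each.
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(10)]
points += [(data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(10)]
cm = consensus_matrix(points, k=2)
```

In a matrix like `cm`, entries close to 0 or 1 indicate pairs whose co-membership is consistent across subsamples, which is the kind of output a stability score can be computed from. Repeating the call over a range of `k` values would give one consensus matrix per candidate number of clusters.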

Original language: English
Article number: btad635
Journal: Bioinformatics
Volume: 39
Issue number: 11
Publication status: Published - 1 Nov 2023
Externally published: Yes

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
