Noise-free latent block model for high dimensional data

  • Laboratoire Hubert Curien UMR CNRS 5516
  • Laboratoire Jean Kuntzmann (LJK)

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Co-clustering is known to be a very powerful and efficient approach in unsupervised learning because of its ability to partition data based on both the observations and the variables of a given dataset. However, in a high-dimensional context, co-clustering methods may fail to provide a meaningful result due to the presence of noisy and/or irrelevant features. In this paper, we tackle this issue by proposing a novel co-clustering model which assumes the existence of a noise cluster that contains all irrelevant features. A variational expectation-maximization-based algorithm is derived for this task, where the automatic variable selection as well as the joint clustering of objects and variables are achieved via a Bayesian framework. Experimental results on synthetic datasets show the efficiency of our model in the context of high-dimensional noisy data. Finally, we highlight the interest of the approach on two real datasets whose goal is to study genetic diversity across the world.

Original language: English
Pages (from-to): 446-473
Number of pages: 28
Journal: Data Mining and Knowledge Discovery
Volume: 33
Issue number: 2
Publication status: Published - 15 March 2019
Externally published: Yes
