TY - JOUR
T1 - A Bayesian Fisher-EM algorithm for discriminative Gaussian subspace clustering
AU - Jouvin, Nicolas
AU - Bouveyron, Charles
AU - Latouche, Pierre
N1 - Publisher Copyright: © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2021/7/1
Y1 - 2021/7/1
N2 - High-dimensional data clustering remains a challenging task for modern statistics and machine learning, with a wide range of applications. In this work, we consider the powerful discriminative latent mixture model and extend it to the Bayesian framework. Data are modeled as a mixture of Gaussians in a low-dimensional discriminative subspace, a Gaussian prior distribution is introduced over the latent group means, and a family of twelve submodels is derived by considering different covariance structures. Model inference is done with a variational EM algorithm, while the discriminative subspace is estimated via a Fisher step that maximizes an unsupervised Fisher criterion. An empirical Bayes procedure is proposed for estimating the prior hyper-parameters, and an integrated classification likelihood criterion is derived for selecting both the number of clusters and the submodel. The performance of the resulting Bayesian Fisher-EM algorithm is investigated in two thorough simulated scenarios, with respect to both dimensionality and noise, assessing its superiority over state-of-the-art Gaussian subspace clustering models. In addition to standard real-data benchmarks, an application to single-image denoising is proposed, displaying relevant results. This work comes with a reference implementation in an R package accompanying the paper and available on CRAN.
AB - High-dimensional data clustering remains a challenging task for modern statistics and machine learning, with a wide range of applications. In this work, we consider the powerful discriminative latent mixture model and extend it to the Bayesian framework. Data are modeled as a mixture of Gaussians in a low-dimensional discriminative subspace, a Gaussian prior distribution is introduced over the latent group means, and a family of twelve submodels is derived by considering different covariance structures. Model inference is done with a variational EM algorithm, while the discriminative subspace is estimated via a Fisher step that maximizes an unsupervised Fisher criterion. An empirical Bayes procedure is proposed for estimating the prior hyper-parameters, and an integrated classification likelihood criterion is derived for selecting both the number of clusters and the submodel. The performance of the resulting Bayesian Fisher-EM algorithm is investigated in two thorough simulated scenarios, with respect to both dimensionality and noise, assessing its superiority over state-of-the-art Gaussian subspace clustering models. In addition to standard real-data benchmarks, an application to single-image denoising is proposed, displaying relevant results. This work comes with a reference implementation in an R package accompanying the paper and available on CRAN.
KW - Dimensionality reduction
KW - High dimensionality
KW - Linear discriminant analysis
KW - Mixture model
U2 - 10.1007/s11222-021-10018-6
DO - 10.1007/s11222-021-10018-6
M3 - Article
AN - SCOPUS:85106974618
SN - 0960-3174
VL - 31
JO - Statistics and Computing
JF - Statistics and Computing
IS - 4
M1 - 44
ER -