Abstract
High-dimensional count data pose significant challenges for statistical analysis and call for methods that are effective yet preserve explainability. We focus on a low-rank constrained variant of the Poisson log-normal model, which relates the observed data to a latent low-dimensional multivariate Gaussian variable via a Poisson distribution. Variational inference has become a gold-standard approach for fitting such a model. While computationally efficient, variational methods usually lack theoretical statistical guarantees with respect to the model. To address this issue, we propose a projected stochastic gradient scheme that directly maximizes the log-likelihood. We prove the convergence of the proposed method when importance sampling is used to estimate the gradient. Specifically, we achieve a convergence rate of O(T^{-1/2} + N^{-1}), where T denotes the number of iterations and N the number of Monte Carlo samples. This result follows from a novel descent lemma for nonconvex L-smooth objective functions with random, biased gradient estimates. We also demonstrate numerically the efficiency of our solution compared to its variational competitor. Our method not only scales with the number of observed samples but also provides access to the desirable properties of the maximum likelihood estimator.
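To make the ingredients named in the abstract concrete, the following is a minimal sketch of one projected stochastic gradient step with an importance-sampling gradient estimate. It is not the authors' implementation. The parameterization (a mean vector `mu` and a p × q loading matrix `C`, with latent `w ~ N(0, I_q)` and counts `y_j ~ Poisson(exp(mu_j + (C w)_j))`), the choice of the prior as proposal distribution, the Frobenius-ball projection, and all function names are assumptions made for illustration. The estimator is self-normalized, which in general makes it biased, with a bias that shrinks as the number of Monte Carlo samples N grows.

```python
import numpy as np

def log_poisson_given_w(y, mu, C, W):
    """log p(y | w_k) for each row w_k of W, dropping the constant -sum(log y!)."""
    Z = mu + W @ C.T                      # (N, p) latent log-intensities
    return (y * Z - np.exp(Z)).sum(axis=1)

def snis_gradient(y, mu, C, W):
    """Self-normalized importance-sampling estimate of grad log p(y).

    Assumption: the rows of W are drawn from the prior N(0, I_q), so the
    unnormalized importance weight of w_k is simply p(y | w_k).
    """
    logw = log_poisson_given_w(y, mu, C, W)
    weights = np.exp(logw - logw.max())   # subtract the max for numerical stability
    weights /= weights.sum()              # normalized weights, sum to 1
    R = y - np.exp(mu + W @ C.T)          # (N, p) residuals y - exp(z_k)
    grad_mu = weights @ R                 # (p,)
    grad_C = (R * weights[:, None]).T @ W # (p, q)
    return grad_mu, grad_C, weights

def project_ball(A, radius):
    """Euclidean projection onto a Frobenius-norm ball (hypothetical constraint set)."""
    norm = np.linalg.norm(A)
    return A if norm <= radius else A * (radius / norm)

# Toy problem: p = 5 count variables, q = 2 latent dimensions, N = 1000 samples.
rng = np.random.default_rng(0)
p, q, N = 5, 2, 1000
mu = np.ones(p)
C = 0.3 * rng.standard_normal((p, q))
y = rng.poisson(np.exp(mu + C @ rng.standard_normal(q)))  # one observed count vector

W = rng.standard_normal((N, q))           # Monte Carlo samples from the proposal
grad_mu, grad_C, weights = snis_gradient(y, mu, C, W)

lr = 0.01                                 # step size (arbitrary for this sketch)
mu_new = mu + lr * grad_mu                # ascent step on the log-likelihood
C_new = project_ball(C + lr * grad_C, radius=10.0)
```

With the samples held fixed, this estimate equals the exact gradient of the logarithm of the importance-sampling likelihood estimate, which gives a convenient finite-difference check. A practical proposal would typically be adapted to each observation rather than taken from the prior.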
| Original language | English |
|---|---|
| Pages (from-to) | 2199-2238 |
| Number of pages | 40 |
| Journal | Electronic Journal of Statistics |
| Volume | 19 |
| Issue number | 1 |
| Publication status | Published - 1 Jan 2025 |
| Externally published | Yes |
Keywords
- Dimension reduction
- Poisson log-normal model
- importance sampling
- multivariate count data
- projected stochastic gradient descent