Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution

Elen Vardanyan, Sona Hunanyan, Tigran Galstyan, Arshak Minasyan, Arnak Dalalyan

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper explores the problem of generative modeling, aiming to simulate diverse examples from an unknown distribution based on observed examples. While recent studies have focused on quantifying the statistical precision of popular algorithms, there is a lack of mathematical evaluation regarding the non-replication of observed examples and the creativity of the generative model. We present theoretical insights into this aspect, demonstrating that the Wasserstein GAN, constrained to left-invertible push-forward maps, generates distributions that not only avoid replication but also significantly deviate from the empirical distribution. Importantly, we show that left-invertibility achieves this without compromising the statistical optimality of the resulting generator. Our most important contribution provides a finite-sample lower bound on the Wasserstein-1 distance between the generative distribution and the empirical one. We also establish a finite-sample upper bound on the distance between the generative distribution and the true data-generating one. Both bounds are explicit and show the impact of key parameters such as sample size, dimensions of the ambient and latent spaces, noise level, and smoothness measured by the Lipschitz constant.

Original language: English
Pages (from-to): 49203-49225
Number of pages: 23
Journal: Proceedings of Machine Learning Research
Volume: 235
Publication status: Published - 1 Jan 2024
Event: 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: 21 Jul 2024 to 27 Jul 2024
