Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We study first-order optimization algorithms for computing the barycenter of Gaussian distributions with respect to the optimal transport metric. Although the objective is geodesically non-convex, Riemannian GD empirically converges rapidly, in fact faster than off-the-shelf methods such as Euclidean GD and SDP solvers. This stands in stark contrast to the best-known theoretical results for Riemannian GD, which depend exponentially on the dimension. In this work, we prove new geodesic convexity results on auxiliary functionals; this provides strong control of the Riemannian GD iterates, ultimately yielding a dimension-free convergence rate. Our techniques also enable the analysis of two related notions of averaging, the entropically-regularized barycenter and the geometric median, providing the first convergence guarantees for Riemannian GD for these problems.
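To make the setting concrete, here is a minimal sketch of one common form of Riemannian GD for the Bures-Wasserstein barycenter of centered Gaussians. It is an illustration under stated assumptions, not the authors' code: the function names (`sqrtm_psd`, `bw_barycenter_gd`), the identity initialization, the step size, and the iteration count are all hypothetical choices. The update used is Σ ← G Σ G with G = (1 − η) I + η Σᵢ wᵢ Tᵢ, where Tᵢ is the optimal transport map from N(0, Σ) to N(0, Σᵢ).

```python
import numpy as np


def sqrtm_psd(a):
    """Symmetric square root of a positive semidefinite matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def bw_barycenter_gd(covs, weights=None, step=1.0, iters=50):
    """Sketch of Riemannian GD for the barycenter of centered Gaussians N(0, covs[i]).

    Assumptions (not taken from the source): identity initialization,
    a fixed step size, and strictly positive definite iterates.
    """
    covs = np.asarray(covs, dtype=float)
    n, d, _ = covs.shape
    if weights is None:
        weights = np.full(n, 1.0 / n)
    eye = np.eye(d)
    sigma = eye.copy()
    for _ in range(iters):
        root = sqrtm_psd(sigma)
        inv_root = np.linalg.inv(root)
        # Weighted average of the transport maps T_i from N(0, sigma) to N(0, covs[i]).
        t_bar = np.zeros((d, d))
        for w, c in zip(weights, covs):
            t_bar += w * (inv_root @ sqrtm_psd(root @ c @ root) @ inv_root)
        # Gradient step along the geodesic: push sigma forward by G.
        g = (1.0 - step) * eye + step * t_bar
        sigma = g @ sigma @ g
        sigma = 0.5 * (sigma + sigma.T)  # re-symmetrize against round-off
    return sigma


if __name__ == "__main__":
    covs = [np.diag([1.0, 4.0]), np.diag([9.0, 16.0])]
    print(bw_barycenter_gd(covs))
```

For commuting covariances the barycenter has a closed form (the square of the averaged matrix square roots), which gives a quick sanity check: the two diagonal inputs above should yield approximately diag(4, 9). Non-centered Gaussians would additionally need the means averaged, which this sketch omits.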

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Editors: Marc'Aurelio Ranzato, Alina Beygelzimer, Yann Dauphin, Percy S. Liang, Jenn Wortman Vaughan
Publisher: Neural information processing systems foundation
Pages: 22132-22145
Number of pages: 14
ISBN (Electronic): 9781713845393
Publication status: Published - 1 Jan 2021
Externally published: Yes
Event: 35th Conference on Neural Information Processing Systems, NeurIPS 2021 - Virtual, Online
Duration: 6 Dec 2021 to 14 Dec 2021

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 27
ISSN (Print): 1049-5258

Conference

Conference: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
City: Virtual, Online
Period: 6/12/21 to 14/12/21
