Linear Time Sinkhorn Divergences using Positive Features

Abstract
Although Sinkhorn divergences are now routinely used in data science to compare probability distributions, computing them remains expensive, with a cost that in general grows quadratically in the size n of the support of these distributions. Indeed, solving optimal transport (OT) with an entropic regularization requires computing an n × n kernel matrix (the negative exponential of an n × n pairwise ground cost matrix) that is repeatedly applied to a vector. We propose to use instead ground costs of the form c(x, y) = −log⟨φ(x), φ(y)⟩, where φ is a map from the ground space onto the positive orthant R^r_+, with r ≪ n. This choice yields, equivalently, a kernel k(x, y) = ⟨φ(x), φ(y)⟩, and ensures that the cost of Sinkhorn iterations scales as O(nr). We show that usual cost functions can be approximated using this form. Additionally, we take advantage of the fact that our approach yields approximations that remain fully differentiable with respect to input distributions, as opposed to previously proposed adaptive low-rank approximations of the kernel matrix, to train a faster variant of OT-GAN [49].
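The O(nr) iteration cost follows from never materializing the n × n kernel matrix: with feature matrices Φ_x, Φ_y ∈ R^{n×r}, the kernel factors as K = Φ_x Φ_y^T, so each Sinkhorn matrix-vector product K v can be computed as Φ_x (Φ_y^T v) in O(nr). A minimal NumPy sketch of this idea, using hypothetical random positive features in place of the paper's constructions:

```python
import numpy as np

def sinkhorn_positive_features(phi_x, phi_y, a, b, n_iters=100):
    """Sinkhorn iterations for the implicit kernel K = phi_x @ phi_y.T.

    K is never formed: every kernel-vector product is factored through
    the r-dimensional feature space, so each iteration costs O(n*r)
    instead of the O(n^2) of a dense kernel matrix.
    """
    u = np.ones(len(a))
    v = np.ones(len(b))
    for _ in range(n_iters):
        # K v computed as phi_x @ (phi_y.T @ v): two O(n*r) products
        u = a / (phi_x @ (phi_y.T @ v))
        # K^T u computed as phi_y @ (phi_x.T @ u)
        v = b / (phi_y @ (phi_x.T @ u))
    return u, v

# Toy example: random positive features (an assumption for illustration,
# not the feature maps proposed in the paper).
rng = np.random.default_rng(0)
n, r = 500, 10
phi_x = rng.random((n, r)) + 0.1  # entries bounded away from zero
phi_y = rng.random((n, r)) + 0.1  # so the iterations converge quickly
a = np.full(n, 1.0 / n)           # uniform source marginal
b = np.full(n, 1.0 / n)           # uniform target marginal
u, v = sinkhorn_positive_features(phi_x, phi_y, a, b)

# The implicit transport plan P = diag(u) K diag(v) should satisfy the
# marginal constraints; its row sums are again computable in O(n*r).
row_sums = u * (phi_x @ (phi_y.T @ v))
```

At convergence, `row_sums` matches the source marginal `a` (and the column sums match `b` by construction of the last update), while only O(nr) memory and time per iteration were used.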
| Original language | English |
|---|---|
| Journal | Advances in Neural Information Processing Systems |
| Volume | 2020-December |
| Publication status | Published - 1 Jan 2020 |
| Externally published | Yes |
| Event | 34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online Duration: 6 Dec 2020 → 12 Dec 2020 |