Abstract
Spectral clustering (Ng et al., 2001) and diffusion maps (Coifman and Lafon, 2006) are celebrated dimensionality reduction algorithms built on eigen-elements related to the diffusive structure of the data. The core of these procedures is the approximation of a Laplacian through a graph kernel approach (Hein et al., 2007); however, this local averaging construction is known to be cursed by the dimension d of the data. In this article, we build a different estimator of the Laplacian's eigenvectors, via a reproducing kernel Hilbert space method, which adapts naturally to the regularity of the problem. We provide non-asymptotic statistical rates proving that the kernel estimator we build can circumvent the curse of dimensionality when the problem is well conditioned. Finally, we discuss techniques (Nyström subsampling, Fourier features) that reduce the computational cost of the estimator without degrading its overall performance.
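For context, the snippet below sketches the graph-kernel baseline that the abstract contrasts with: approximating the Laplacian's eigenvectors from a Gaussian affinity matrix, as in spectral clustering (Ng et al., 2001) and diffusion maps (Coifman and Lafon, 2006). It is illustrative only and is not the paper's RKHS estimator; the bandwidth `sigma`, the toy data, and the function name `diffusion_map_embedding` are assumptions made for this example.

```python
# Minimal sketch (not the authors' method): diffusion-map coordinates
# from the eigenvectors of a normalized graph-kernel Laplacian.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.linalg import eigh

def diffusion_map_embedding(X, n_components=2, sigma=1.0):
    """Embed points X of shape (n, d) using the top nontrivial
    eigenvectors of the random-walk graph-kernel operator."""
    # Gaussian affinities W_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    W = np.exp(-squareform(pdist(X, "sqeuclidean")) / (2.0 * sigma**2))
    d = W.sum(axis=1)
    # Symmetric normalization D^{-1/2} W D^{-1/2}; its eigenpairs
    # recover those of the random-walk operator D^{-1} W.
    M = W / np.sqrt(np.outer(d, d))
    vals, vecs = eigh(M)  # eigenvalues in ascending order
    # Descending order, skipping the trivial top eigenvector.
    idx = np.argsort(vals)[::-1][1:n_components + 1]
    # Right eigenvectors of D^{-1} W, weighted by their eigenvalues,
    # give the diffusion-map coordinates (at diffusion time t = 1).
    psi = vecs[:, idx] / np.sqrt(d)[:, None]
    return psi * vals[idx]

# Toy usage: two Gaussian blobs in dimension d = 10.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(4, 1, (50, 10))])
emb = diffusion_map_embedding(X, n_components=2, sigma=2.0)
print(emb.shape)  # (100, 2)
```

This local-averaging construction needs the kernel bandwidth to shrink with the sample size, which is exactly where the curse of dimensionality enters and which the paper's RKHS estimator is designed to avoid.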
| Original language | English |
|---|---|
| Pages (from-to) | 5236-5259 |
| Number of pages | 24 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 195 |
| Publication status | Published - 1 Jan 2023 |
| Externally published | Yes |
| Event | 36th Annual Conference on Learning Theory, COLT 2023 - Bangalore, India |
| Duration | 12 Jul 2023 → 15 Jul 2023 |
Keywords
- diffusion maps
- dimensionality reduction
- graph Laplacians
- reproducing kernel Hilbert spaces
- spectral clustering