Abstract
We consider a reformulation of Reduced-Rank Regression (RRR) and Sparse Reduced-Rank Regression (SRRR) as a non-convex non-differentiable function of a single one of the two matrices usually introduced to parametrize low-rank matrix learning problems. We study the behavior of proximal gradient algorithms for the minimization of this objective. In particular, based on an analysis of the geometry of the problem, we establish that a proximal Polyak-Łojasiewicz inequality is satisfied in a neighborhood of the set of optima under a condition on the regularization parameter. We consequently derive linear convergence rates for proximal gradient descent with line search and for related algorithms in a neighborhood of the optima. Our experiments show that our formulation leads to much faster learning algorithms for RRR and especially for SRRR.
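As a rough illustration of the algorithmic ingredient named in the abstract, the sketch below runs proximal gradient descent with backtracking line search on a row-sparse (group-lasso) penalized objective. It is a minimal sketch, not the authors' implementation: the smooth loss is a simplified least-squares surrogate with a fixed second factor rather than the variable-projected RRR/SRRR objective studied in the paper, and all names and parameters (`row_group_prox`, `prox_grad_line_search`, `lam`, `eta0`, `beta`) are illustrative assumptions.

```python
# Hedged sketch: proximal gradient descent with backtracking line search and a
# row-wise l2,1 (group-lasso) proximal step.  The smooth loss is a simplified
# stand-in, NOT the single-matrix SRRR objective from the paper.
import numpy as np

def row_group_prox(B, tau):
    """Prox of tau * sum_i ||B[i, :]||_2: row-wise soft-thresholding."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * B

def prox_grad_line_search(X, Y, rank, lam, max_iter=500, eta0=1.0, beta=0.5, tol=1e-8):
    """Minimize a smooth least-squares surrogate plus lam * sum_i ||B[i, :]||_2."""
    p = X.shape[1]
    q = Y.shape[1]
    B = np.zeros((p, rank))
    # Fixed "loading" factor used only to make this toy loss low-rank in shape.
    W = np.random.default_rng(0).standard_normal((rank, q)) / np.sqrt(rank)

    def smooth_loss(B):
        R = Y - X @ B @ W
        return 0.5 * np.sum(R ** 2)

    def smooth_grad(B):
        return X.T @ (X @ B @ W - Y) @ W.T

    loss = smooth_loss(B)
    for _ in range(max_iter):
        G = smooth_grad(B)
        eta = eta0
        while True:
            # Backtracking: shrink the step until the sufficient-decrease
            # condition for proximal gradient holds.
            B_new = row_group_prox(B - eta * G, eta * lam)
            diff = B_new - B
            if smooth_loss(B_new) <= loss + np.sum(G * diff) + np.sum(diff ** 2) / (2 * eta):
                break
            eta *= beta
        if np.linalg.norm(diff) <= tol * max(1.0, np.linalg.norm(B)):
            return B_new
        B, loss = B_new, smooth_loss(B_new)
    return B
```

The backtracking step uses the standard proximal sufficient-decrease test, so no Lipschitz constant of the gradient needs to be known in advance; this mirrors the line-search variant referenced in the abstract, under the stated simplifications.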
| Original language | English |
|---|---|
| Pages (from-to) | 2415-2424 |
| Number of pages | 10 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 89 |
| Publication status | Published - 1 Jan 2019 |
| Event | 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Japan, 16–18 Apr 2019 |