Abstract
This paper presents a new generalization error analysis for Decentralized Stochastic Gradient Descent (D-SGD) based on algorithmic stability. The obtained results overhaul a series of recent works that suggested an increased instability due to decentralization and a detrimental impact of poorly-connected communication graphs on generalization. On the contrary, we show, for convex, strongly convex and non-convex functions, that D-SGD can always recover generalization bounds analogous to those of classical SGD, suggesting that the choice of graph does not matter. We then argue that this result stems from a worst-case analysis, and we provide a refined optimization-dependent generalization bound for general convex functions. This new bound reveals that the choice of graph can in fact improve the worst-case bound in certain regimes and that, surprisingly, a poorly-connected graph can even be beneficial for generalization.
| Original language | English |
|---|---|
| Pages (from - to) | 26215-26240 |
| Number of pages | 26 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 235 |
| Status | Published - 1 Jan 2024 |
| Event | 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria. Duration: 21 Jul 2024 → 27 Jul 2024 |
Paper title: "Improved Stability and Generalization Guarantees of the Decentralized SGD Algorithm".