
Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation

  • EPFL
  • École des Ponts

Research output: Contribution to journal › Conference article › peer-review

Abstract

Understanding the implicit bias of training algorithms is crucial to explaining the success of overparametrised neural networks. In this paper, we study the role of label noise in the training dynamics of a quadratically parametrised model through its continuous-time version. We explicitly characterise the solution chosen by the stochastic flow and prove that it implicitly solves a Lasso program. To complete our analysis, we provide nonasymptotic convergence guarantees for the dynamics as well as conditions for support recovery. We also give experimental results which support our theoretical claims. Our findings highlight the fact that structured noise can induce better generalisation and help explain the better performance of stochastic dynamics observed in practice.
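
As an illustration of the kind of dynamics the abstract describes, the following is a minimal sketch of discrete-time label-noise gradient descent on a quadratically parametrised linear model. Several choices here are assumptions and are not taken from the abstract: the parametrisation beta = u*u - v*v (elementwise), the Gaussian label perturbation redrawn at every step, the squared loss, and the hypothetical names `label_noise_gd`, `X_demo` and `y_demo`. The paper itself analyses a continuous-time flow, so this should be read as a toy discretisation rather than the authors' method.

```python
import random


def label_noise_gd(X, y, steps=2000, lr=0.01, noise_std=0.2, init=0.5, seed=0):
    """Gradient descent on the squared loss with freshly perturbed labels.

    Assumed model (not specified in the abstract): beta = u*u - v*v, elementwise.
    X is a list of n rows of length d, y is a list of n labels.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    u = [init] * d
    v = [init] * d  # u == v at the start, so beta starts at zero
    for _ in range(steps):
        beta = [uj * uj - vj * vj for uj, vj in zip(u, v)]
        # Residuals against labels perturbed by fresh Gaussian noise.
        r = [
            sum(X[i][j] * beta[j] for j in range(d))
            - (y[i] + noise_std * rng.gauss(0.0, 1.0))
            for i in range(n)
        ]
        # Gradient of (1 / 2n) * sum(r_i ** 2) with respect to beta.
        g = [sum(X[i][j] * r[i] for i in range(n)) / n for j in range(d)]
        # Chain rule: d beta_j / d u_j = 2 u_j and d beta_j / d v_j = -2 v_j.
        u = [uj - lr * 2.0 * uj * gj for uj, gj in zip(u, g)]
        v = [vj + lr * 2.0 * vj * gj for vj, gj in zip(v, g)]
    return [uj * uj - vj * vj for uj, vj in zip(u, v)]


# Hypothetical toy data: 3 samples, 4 features, labels generated from the
# sparse vector (1, 0, 0, 0) without noise.
X_demo = [
    [1.0, 0.5, -0.3, 0.2],
    [0.4, -1.0, 0.6, 0.1],
    [-0.2, 0.3, 1.0, -0.7],
]
y_demo = [1.0, 0.4, -0.2]
beta_demo = label_noise_gd(X_demo, y_demo)
```

Calling `label_noise_gd(X_demo, y_demo)` returns a coefficient vector of length 4. Setting `noise_std=0.0` recovers plain gradient descent on the same parametrisation, which gives a baseline for comparing how the label noise changes the selected solution. This toy run does not verify the paper's theorem or its support-recovery conditions.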

Original language: English
Pages (from-to): 2127-2159
Number of pages: 33
Journal: Proceedings of Machine Learning Research
Volume: 178
Publication status: Published - 1 Jan 2022
Externally published: Yes
Event: 35th Conference on Learning Theory, COLT 2022 - Hybrid, London, United Kingdom
Duration: 2 Jul 2022 to 5 Jul 2022

Keywords

  • Label noise
  • Lasso
  • Sparse Regression
  • Stochastic dynamics
