Dodging the Double Descent in Deep Neural Networks

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Finding the optimal size of deep learning models is a timely problem of broad impact, especially for energy-saving schemes. Recently, an unexpected phenomenon, the "double descent", has caught the attention of the deep learning community: as a model's size grows, its performance first worsens and then improves again. This raises serious questions about the optimal model size for high generalization: the model needs to be sufficiently over-parametrized, but adding too many parameters wastes training resources. Is it possible to find the best trade-off efficiently? Our work shows that the double descent phenomenon is potentially avoidable with proper conditioning of the learning problem, although a definitive answer is yet to be found. We empirically observe that there is hope to dodge the double descent in complex scenarios through proper regularization, as a simple ℓ2 penalty already contributes positively in this direction.
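
The abstract points to ℓ2 regularization as a concrete mitigation. As a minimal, hedged sketch (not the authors' experimental setup; the model width, learning rate, decay coefficient, and data below are illustrative placeholders), ℓ2 regularization is commonly applied in PyTorch through the optimizer's weight_decay parameter:

    # Minimal sketch: l2 regularization via weight decay in PyTorch.
    # NOT the paper's setup; width, lr, and weight_decay are placeholders.
    import torch
    import torch.nn as nn

    model = nn.Sequential(          # toy over-parametrized MLP
        nn.Linear(784, 2048),
        nn.ReLU(),
        nn.Linear(2048, 10),
    )

    # weight_decay adds weight_decay * w to each parameter's gradient,
    # i.e. the gradient of an l2 penalty (weight_decay / 2) * ||w||^2.
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

    criterion = nn.CrossEntropyLoss()
    x = torch.randn(32, 784)                 # dummy batch
    y = torch.randint(0, 10, (32,))

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

Sweeping weight_decay across model sizes is the kind of experiment that would test whether the test-error peak at the interpolation threshold flattens out.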

Original language: English
Title of host publication: 2023 IEEE International Conference on Image Processing, ICIP 2023 - Proceedings
Publisher: IEEE Computer Society
Pages: 1625-1629
Number of pages: 5
ISBN (Electronic): 9781728198354
DOIs
Publication status: Published - 1 Jan 2023
Event: 30th IEEE International Conference on Image Processing, ICIP 2023 - Kuala Lumpur, Malaysia
Duration: 8 Oct 2023 – 11 Oct 2023

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 30th IEEE International Conference on Image Processing, ICIP 2023
Country/Territory: Malaysia
City: Kuala Lumpur
Period: 8/10/23 – 11/10/23

Keywords

  • Double descent
  • deep learning
  • pruning
  • regularization
