Invariance-based layer regularization for sound event detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Experimental and theoretical evidence suggests that invariance constraints can improve the performance and generalization capabilities of a classification model. While invariance-based regularization has become part of the standard tool-belt of machine learning practitioners, this regularization is usually applied near the decision layers or at the end of the feature-extracting layers of a deep classification network. However, the optimal placement of invariance constraints inside a deep classifier remains an open question. In particular, it would be beneficial to link this placement to the structural properties of the network (e.g. its architecture) or its dynamical properties (e.g. the effectively used volume of its latent spaces). The purpose of this article is to initiate an investigation of these aspects. We use the experimental framework of the DCASE 2023 Task 4A challenge, which considers the training of a sound event classifier in a semi-supervised manner. We show that the optimal placement of invariance constraints improves the performance of the standard baseline for this task.
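As a rough illustration of the idea described in the abstract (not the paper's actual implementation), invariance-based layer regularization can be sketched as a penalty on the distance between a chosen layer's activations for an input and for an augmented view of that input. The tiny MLP, the augmentation, and the `layer` hyperparameter below are all hypothetical stand-ins; the paper's contribution concerns where in the network such a penalty is best applied.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights):
    """Run x through a small MLP, keeping the activations of every layer."""
    activations = [x]
    for W in weights:
        x = relu(x @ W)
        activations.append(x)
    return activations

def invariance_penalty(x, x_aug, weights, layer):
    """Mean squared distance between the activations of x and of its
    augmented view x_aug at the chosen layer; the choice of `layer`
    is the placement question the paper investigates."""
    a = forward(x, weights)[layer]
    b = forward(x_aug, weights)[layer]
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 16)), rng.standard_normal((16, 4))]
x = rng.standard_normal((2, 8))
# stand-in augmentation, e.g. a mild time/frequency perturbation of the features
x_aug = x + 0.01 * rng.standard_normal((2, 8))

# in training, the total loss would be classification_loss + lam * invariance_penalty(...)
penalties = [invariance_penalty(x, x_aug, weights, k) for k in (1, 2)]
print(penalties)
```

Sweeping `layer` over the network's depth while monitoring validation performance is the simplest way to probe the placement question empirically.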

Original language: English
Title of host publication: 32nd European Signal Processing Conference, EUSIPCO 2024 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 51-55
Number of pages: 5
ISBN (Electronic): 9789464593617
DOIs
Publication status: Published - 1 Jan 2024
Event: 32nd European Signal Processing Conference, EUSIPCO 2024 - Lyon, France
Duration: 26 Aug 2024 - 30 Aug 2024

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491

Conference

Conference: 32nd European Signal Processing Conference, EUSIPCO 2024
Country/Territory: France
City: Lyon
Period: 26/08/24 - 30/08/24

Keywords

  • DCASE task 4
  • invariance-based learning
  • semi-supervised learning
