The Deep Learning Revolution in MIR: The Pros and Cons, the Needs and the Challenges

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

This paper deals with the deep learning revolution in Music Information Research (MIR), i.e. the switch from knowledge-driven hand-crafted systems to data-driven deep-learning systems. To discuss the pros and cons of this revolution, we first review the basic elements of deep learning and explain how they can be used for audio feature learning or for solving difficult MIR tasks. We then discuss the case of hand-crafted features and demonstrate that, while those were indeed shallow and explainable at the start, they tended to become deep, data-driven and unexplainable over time, well before the reign of deep learning. The development of these data-driven approaches was made possible by the increasing access to large annotated datasets. We therefore argue that these annotated datasets are today the central and most sustainable element of any MIR research. We propose new ways to obtain them at scale. Finally, we highlight a set of challenges to be faced by the deep learning revolution in MIR, especially concerning the consideration of music specificities, the explainability of the models (X-AI) and their environmental cost (Green-AI).

Original language: English
Title of host publication: Perception, Representations, Image, Sound, Music - 14th International Symposium, CMMR 2019, Revised Selected Papers
Editors: Richard Kronland-Martinet, Sølvi Ystad, Mitsuko Aramaki
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-30
Number of pages: 28
ISBN (Print): 9783030702090
Publication status: Published - 1 Jan 2021
Externally published: Yes
Event: 14th International Symposium on Perception, Representations, Image, Sound, Music, CMMR 2019 - Marseille, France
Duration: 14 Oct 2019 to 18 Oct 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12631 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 14th International Symposium on Perception, Representations, Image, Sound, Music, CMMR 2019
Country/Territory: France
City: Marseille
Period: 14/10/19 to 18/10/19

Keywords

  • Audio feature
  • Deep-learning
  • Machine-learning
  • Music information retrieval
