TY - GEN
T1 - The Deep Learning Revolution in MIR
T2 - 14th International Symposium on Perception, Representations, Image, Sound, Music, CMMR 2019
AU - Peeters, Geoffroy
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - This paper deals with the deep learning revolution in Music Information Research (MIR), i.e. the switch from knowledge-driven hand-crafted systems to data-driven deep-learning systems. To discuss the pros and cons of this revolution, we first review the basic elements of deep learning and explain how they can be used for audio feature learning or for solving difficult MIR tasks. We then discuss the case of hand-crafted features and demonstrate that, while these were indeed shallow and explainable at the start, they tended to become deep, data-driven and unexplainable over time, even before the reign of deep learning. The development of these data-driven approaches was enabled by the increasing access to large annotated datasets. We therefore argue that these annotated datasets are today the central and most sustainable element of any MIR research. We propose new ways to obtain them at scale. Finally, we highlight a set of challenges to be faced by the deep learning revolution in MIR, especially concerning the consideration of music specificities, the explainability of the models (X-AI) and their environmental cost (Green-AI).
AB - This paper deals with the deep learning revolution in Music Information Research (MIR), i.e. the switch from knowledge-driven hand-crafted systems to data-driven deep-learning systems. To discuss the pros and cons of this revolution, we first review the basic elements of deep learning and explain how they can be used for audio feature learning or for solving difficult MIR tasks. We then discuss the case of hand-crafted features and demonstrate that, while these were indeed shallow and explainable at the start, they tended to become deep, data-driven and unexplainable over time, even before the reign of deep learning. The development of these data-driven approaches was enabled by the increasing access to large annotated datasets. We therefore argue that these annotated datasets are today the central and most sustainable element of any MIR research. We propose new ways to obtain them at scale. Finally, we highlight a set of challenges to be faced by the deep learning revolution in MIR, especially concerning the consideration of music specificities, the explainability of the models (X-AI) and their environmental cost (Green-AI).
KW - Audio feature
KW - Deep-learning
KW - Machine-learning
KW - Music information retrieval
UR - https://www.scopus.com/pages/publications/85103466210
U2 - 10.1007/978-3-030-70210-6_1
DO - 10.1007/978-3-030-70210-6_1
M3 - Conference contribution
AN - SCOPUS:85103466210
SN - 9783030702090
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 30
BT - Perception, Representations, Image, Sound, Music - 14th International Symposium, CMMR 2019, Revised Selected Papers
A2 - Kronland-Martinet, Richard
A2 - Ystad, Sølvi
A2 - Aramaki, Mitsuko
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 14 October 2019 through 18 October 2019
ER -