TY - GEN
T1 - Deep-rhythm for tempo estimation and rhythm pattern recognition
AU - Foroughmand, Hadrien
AU - Peeters, Geoffroy
N1 - Publisher Copyright:
© 2020 International Society for Music Information Retrieval. All rights reserved.
PY - 2019/1/1
Y1 - 2019/1/1
AB - It has been shown that the harmonic series at the tempo frequency of the onset-strength-function of an audio signal accurately describes its rhythm pattern and can be used to perform tempo or rhythm pattern estimation. Recently, in the case of multi-pitch estimation, the depth of the input layer of a convolutional network has been used to represent the harmonic series of pitch candidates. We use a similar idea here to represent the harmonic series of tempo candidates. We propose the Harmonic-Constant-Q-Modulation, which represents, using a 4D tensor, the harmonic series of modulation frequencies (considered as tempo frequencies) in several acoustic frequency bands over time. This representation is used as input to a convolutional network which is trained to estimate tempo or rhythm pattern classes. Using a large number of datasets, we evaluate the performance of our approach and compare it with previous approaches. We show that it slightly increases Accuracy-1 for tempo estimation but not the average-mean-Recall for rhythm pattern recognition.
M3 - Conference contribution
AN - SCOPUS:85087093890
T3 - Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019
SP - 636
EP - 643
BT - Proceedings of the 20th International Society for Music Information Retrieval Conference, ISMIR 2019
A2 - Flexer, Arthur
A2 - Peeters, Geoffroy
A2 - Urbano, Julian
A2 - Volk, Anja
PB - International Society for Music Information Retrieval
T2 - 20th International Society for Music Information Retrieval Conference, ISMIR 2019
Y2 - 4 November 2019 through 8 November 2019
ER -