TY - JOUR
T1 - The Inverse Drum Machine
T2 - Source Separation Through Joint Transcription and Analysis-by-Synthesis
AU - Torres, Bernardo
AU - Peeters, Geoffroy
AU - Richard, Gael
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2026/1/1
Y1 - 2026/1/1
N2 - We present the Inverse Drum Machine, a novel approach to Drum Source Separation that leverages an analysis-by-synthesis framework combined with deep learning. Unlike recent supervised methods that require isolated stem recordings for training, our approach is trained on drum mixtures with only transcription annotations. IDM integrates Automatic Drum Transcription and One-shot Drum Sample Synthesis, jointly optimizing these tasks in an end-to-end manner. By convolving synthesized one-shot samples with estimated onsets, akin to a drum machine, we reconstruct the individual drum stems and train a Deep Neural Network on the reconstruction of the mixture. Experiments on the StemGMD dataset demonstrate that IDM achieves separation quality comparable to state-of-the-art supervised methods that require isolated stem data.
AB - We present the Inverse Drum Machine, a novel approach to Drum Source Separation that leverages an analysis-by-synthesis framework combined with deep learning. Unlike recent supervised methods that require isolated stem recordings for training, our approach is trained on drum mixtures with only transcription annotations. IDM integrates Automatic Drum Transcription and One-shot Drum Sample Synthesis, jointly optimizing these tasks in an end-to-end manner. By convolving synthesized one-shot samples with estimated onsets, akin to a drum machine, we reconstruct the individual drum stems and train a Deep Neural Network on the reconstruction of the mixture. Experiments on the StemGMD dataset demonstrate that IDM achieves separation quality comparable to state-of-the-art supervised methods that require isolated stem data.
KW - Audio source separation
KW - analysis-by-synthesis
KW - deep learning
KW - signal processing
UR - https://www.scopus.com/pages/publications/105020991359
U2 - 10.1109/TASLPRO.2025.3629286
DO - 10.1109/TASLPRO.2025.3629286
M3 - Article
AN - SCOPUS:105020991359
SN - 1558-7916
VL - 34
SP - 84
EP - 95
JO - IEEE Transactions on Audio, Speech and Language Processing
JF - IEEE Transactions on Audio, Speech and Language Processing
ER -