The Inverse Drum Machine: Source Separation Through Joint Transcription and Analysis-by-Synthesis

Research output: Contribution to journal › Article › peer-review

Abstract

We present the Inverse Drum Machine (IDM), a novel approach to Drum Source Separation that combines an analysis-by-synthesis framework with deep learning. Unlike recent supervised methods, which require isolated stem recordings for training, IDM is trained on drum mixtures with only transcription annotations. It integrates Automatic Drum Transcription and One-shot Drum Sample Synthesis and optimizes both tasks jointly, end to end. By convolving synthesized one-shot samples with estimated onsets, as a drum machine does, IDM reconstructs the individual drum stems, and a Deep Neural Network is trained on the reconstruction of the mixture. Experiments on the StemGMD dataset show that IDM achieves separation quality comparable to state-of-the-art supervised methods that require isolated-stem data.
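The drum-machine reconstruction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: in IDM the onsets and one-shot samples are estimated by a neural network, whereas here they are fixed toy values, and the function names (`render_stem`, `mix_stems`, `mse`) and the time-domain mean-squared error are assumptions made only for illustration.

```python
def render_stem(onsets, one_shot):
    """Convolve an onset activation sequence with a one-shot sample.

    Each non-zero onset places a scaled copy of the one-shot at that
    position, which is what a drum machine does when it triggers a sample.
    """
    out = [0.0] * (len(onsets) + len(one_shot) - 1)
    for n, amplitude in enumerate(onsets):
        if amplitude == 0.0:
            continue
        for k, sample in enumerate(one_shot):
            out[n + k] += amplitude * sample
    return out


def mix_stems(stems):
    """Sum equal-length stems sample by sample to obtain the mixture estimate."""
    return [sum(samples) for samples in zip(*stems)]


def mse(estimate, target):
    """Mean-squared error, used here as a stand-in reconstruction loss (assumption)."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


# Toy one-shot samples and onset activations (hypothetical values).
kick_shot = [1.0, 0.5, 0.25]
snare_shot = [0.8, -0.4, 0.2]
kick_onsets = [1.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
snare_onsets = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0]

# Reconstruct each stem, then sum the stems into a mixture estimate.
kick_stem = render_stem(kick_onsets, kick_shot)
snare_stem = render_stem(snare_onsets, snare_shot)
mixture_estimate = mix_stems([kick_stem, snare_stem])

# In training, the target would be the observed drum mixture; the estimate
# is reused as the target here only so the example is self-contained.
loss = mse(mixture_estimate, mixture_estimate)
```

Because only the summed mixture enters the loss, a sketch like this needs no isolated stems as targets; the individual stems are intermediate results of the reconstruction.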

Original language: English
Pages (from-to): 84-95
Number of pages: 12
Journal: IEEE Transactions on Audio, Speech and Language Processing
Volume: 34
DOIs
Publication status: Published - 1 Jan 2026

Keywords

  • Audio source separation
  • analysis-by-synthesis
  • deep learning
  • signal processing
