Harmonic and inharmonic Nonnegative Matrix Factorization for polyphonic pitch transcription

Emmanuel Vincent, Nancy Bertin, Roland Badeau

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Polyphonic pitch transcription consists of estimating the onset time, duration and pitch of each note in a music signal. This task is difficult in general, due to the wide range of possible instruments. It has been studied using adaptive models such as Nonnegative Matrix Factorization (NMF), which describe the signal as a weighted sum of basis spectra. However, basis spectra that represent multiple pitches result in inaccurate transcription. To avoid this, we propose a family of constrained NMF models, where each basis spectrum is expressed as a weighted sum of narrowband spectra consisting of a few adjacent partials at harmonic or inharmonic frequencies. The model parameters are adapted via combined multiplicative and Newton updates. The proposed method is shown to outperform standard NMF on a database of piano excerpts.
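The constrained model described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`narrowband_spectra`, `fit_constrained_nmf`), the Gaussian partial shape, the non-overlapping grouping of partials into bands, the stiff-string formula used for inharmonic partials, and the plain Euclidean cost are all assumptions made for illustration. Partial frequencies are held fixed here, so the sketch uses multiplicative updates only and omits the Newton updates mentioned in the abstract.

```python
import numpy as np

def narrowband_spectra(f0_bins, n_bins, n_bands=4, partials_per_band=3,
                       inharm=0.0, width=1.0):
    """Build E with shape (J, F, K): K narrowband spectra for each of J pitches.

    Each narrowband spectrum sums a few adjacent partials, drawn as Gaussian
    lobes (an assumed shape). Partial m sits at m*f0*sqrt(1 + inharm*m^2),
    the usual stiff-string formula (assumed); inharm=0 gives harmonic partials.
    """
    bins = np.arange(n_bins)
    E = np.zeros((len(f0_bins), n_bins, n_bands))
    for j, f0 in enumerate(f0_bins):
        for k in range(n_bands):
            for p in range(partials_per_band):
                m = k * partials_per_band + p + 1  # partial index, from 1
                fm = m * f0 * np.sqrt(1.0 + inharm * m ** 2)
                if fm < n_bins:
                    E[j, :, k] += np.exp(-0.5 * ((bins - fm) / width) ** 2)
    return E

def fit_constrained_nmf(V, E, n_iter=100, eps=1e-12, seed=0):
    """Fit V ~= W @ H with W[:, j] = sum_k A[j, k] * E[j, :, k].

    Uses Lee-Seung style multiplicative updates for the Euclidean cost,
    applied to the activations H and the band weights A (not to W directly).
    """
    rng = np.random.default_rng(seed)
    J, F, K = E.shape
    A = rng.random((J, K)) + 0.1           # nonnegative band weights
    H = rng.random((J, V.shape[1])) + 0.1  # nonnegative activations
    costs = []
    for _ in range(n_iter):
        W = np.einsum('jfk,jk->fj', E, A)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        num = np.einsum('jfk,fj->jk', E, V @ H.T)
        den = np.einsum('jfk,fj->jk', E, (W @ H) @ H.T) + eps
        A *= num / den
        W = np.einsum('jfk,jk->fj', E, A)
        costs.append(0.5 * np.sum((V - W @ H) ** 2))
    return W, A, H, costs

# Synthetic check: two pitches with made-up fundamentals (in frequency bins).
rng = np.random.default_rng(1)
E = narrowband_spectra([10.0, 15.0], n_bins=128)
A_true = rng.random((2, 4))
H_true = rng.random((2, 30))
V = np.einsum('jfk,jk->fj', E, A_true) @ H_true
W, A, H, costs = fit_constrained_nmf(V, E)
```

Because each column of `W` is restricted to the span of one pitch's narrowband spectra, a basis spectrum cannot absorb partials belonging to a different pitch, which is the failure of unconstrained NMF that the abstract points to.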

Original language: English
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 109-112
Number of pages: 4
DOIs
Publication status: Published - 16 Sept 2008
Externally published: Yes
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: 31 Mar 2008 to 4 Apr 2008

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country/Territory: United States
City: Las Vegas, NV
Period: 31/03/08 to 4/04/08

Keywords

  • Harmonicity
  • Inharmonicity
  • Nonnegative matrix factorization
  • Pitch transcription
  • Spectral smoothness
