A conditional random field viewpoint of symbolic audio-to-score matching

Cyril Joder, Slim Essid, Gaël Richard

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We present a new approach to symbolic audio-to-score alignment based on Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow state conditional probabilities to be computed from several audio frames. The CRF models that we propose exploit this property to take into account the rhythmic information of the musical score. Assuming that the tempo is locally constant, they compare the neighborhood of each frame against several tempo hypotheses. Experiments on a pop-music database show that this use of contextual information leads to a significant improvement in alignment accuracy. In particular, the proportion of onsets detected within a 100-ms tolerance window increases by more than 10% when a 1-s neighborhood is considered.
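The idea of scoring a frame's neighborhood against several tempo hypotheses can be illustrated with a minimal sketch. This is not the authors' model: the score is assumed to be discretized into positions with one feature template each, the potential is a plain dot product, and the names `context_score`, `align`, `rates`, `half_window` and `max_step` are hypothetical. Decoding is a standard Viterbi pass over a left-to-right chain.

```python
def context_score(obs, templates, t, p, rate, half_window):
    """Similarity between the frames around t and the score around position p,
    assuming the score advances by `rate` positions per frame (locally
    constant tempo). Illustrative potential only."""
    total = 0.0
    for k in range(-half_window, half_window + 1):
        u = t + k
        if u < 0 or u >= len(obs):
            continue  # neighborhood is truncated at the signal edges
        # Score position expected at frame u under this tempo hypothesis.
        q = min(max(int(round(p + k * rate)), 0), len(templates) - 1)
        total += sum(a * b for a, b in zip(obs[u], templates[q]))
    return total


def align(obs, templates, rates=(0.5, 1.0, 2.0), half_window=2, max_step=2):
    """Return one score position per frame (Viterbi decoding)."""
    T, P = len(obs), len(templates)
    # Observation potential: keep the best tempo hypothesis per (frame, position).
    phi = [[max(context_score(obs, templates, t, p, r, half_window) for r in rates)
            for p in range(P)] for t in range(T)]
    neg_inf = float("-inf")
    delta = [[neg_inf] * P for _ in range(T)]
    back = [[0] * P for _ in range(T)]
    delta[0][0] = phi[0][0]  # the alignment is assumed to start at position 0
    for t in range(1, T):
        for p in range(P):
            best, arg = neg_inf, 0
            for step in range(max_step + 1):  # stay, or advance up to max_step
                q = p - step
                if q >= 0 and delta[t - 1][q] > best:
                    best, arg = delta[t - 1][q], q
            if best > neg_inf:
                delta[t][p] = best + phi[t][p]
                back[t][p] = arg
    # Backtrack from the best final position.
    p = max(range(P), key=lambda j: delta[T - 1][j])
    path = [p]
    for t in range(T - 1, 0, -1):
        p = back[t][p]
        path.append(p)
    return path[::-1]


# Toy input: three one-hot "notes", each held for two frames.
templates = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
obs = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
framewise = align(obs, templates, half_window=0)  # no context, HMM-like
contextual = align(obs, templates, half_window=2)  # neighborhood of 5 frames
```

With `half_window=0` the potential depends on a single frame, which is the kind of framewise observation model the abstract contrasts with; a larger window lets each decision draw on neighboring frames.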

Original language: English
Title of host publication: MM'10 - Proceedings of the ACM Multimedia 2010 International Conference
Pages: 871-874
Number of pages: 4
Publication status: Published - 1 Dec 2010
Externally published: Yes
Event: 18th ACM International Conference on Multimedia, ACM Multimedia 2010, MM'10 - Firenze, Italy
Duration: 25 Oct 2010 to 29 Oct 2010

Publication series

Name: MM'10 - Proceedings of the ACM Multimedia 2010 International Conference

Conference

Conference: 18th ACM International Conference on Multimedia, ACM Multimedia 2010, MM'10
Country/Territory: Italy
City: Firenze
Period: 25/10/10 to 29/10/10

Keywords

  • audio/score alignment
  • conditional random fields
  • indexing
  • music information retrieval
