On the correlation of automatic audio and visual segmentations of music videos

Olivier Gillet, Slim Essid, Gaël Richard

Research output: Contribution to journal › Article › peer-review

Abstract

The study of the associations between audio and video content has numerous important applications in information retrieval and multimedia content authoring. In this work, we focus on music videos, which exhibit a broad range of structural and semantic relationships between the music and the video content. To identify such relationships, a two-level automatic structuring of the music and the video is performed separately. Note onsets are detected from the music signal, along with section changes; the latter is achieved by a novel algorithm which combines feature selection with statistical novelty detection based on kernel methods. The video stream is independently segmented to detect changes in motion activity as well as shot boundaries. Based on this two-level segmentation of both streams, four audio-visual correlation measures are computed. The usefulness of these correlation measures is illustrated by a query-by-video experiment on a database of 100 music videos, which also reveals interesting genre dependencies.
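The paper does not specify its four correlation measures in the abstract, but the underlying idea of scoring co-occurrence between audio and video boundary sequences can be illustrated with a minimal sketch. The function below (a hypothetical illustration, not the authors' actual measures) takes lists of audio boundary times (e.g. note onsets) and video boundary times (e.g. shot cuts) and returns the fraction of video boundaries that fall within a small tolerance of some audio boundary:

```python
def boundary_correlation(audio_times, video_times, tol=0.1):
    """Fraction of video boundaries matched by an audio boundary
    within `tol` seconds. A simple co-occurrence score, given here
    only as an illustration of cross-modal boundary correlation."""
    if not video_times:
        return 0.0
    matched = sum(
        1 for v in video_times
        if any(abs(v - a) <= tol for a in audio_times)
    )
    return matched / len(video_times)

# Example: three shot cuts, two of which land near note onsets.
onsets = [0.0, 1.02, 2.5, 3.98]   # audio boundaries (seconds)
cuts = [1.0, 2.0, 4.0]            # video boundaries (seconds)
score = boundary_correlation(onsets, cuts)  # -> 2/3
```

In practice such a score could be computed at both segmentation levels (onsets vs. motion changes, section changes vs. shot boundaries), yielding one measure per pairing.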

Original language: English
Pages (from-to): 347-355
Number of pages: 9
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 17
Issue number: 3
DOIs
Publication status: Published - 1 Mar 2007
Externally published: Yes

Keywords

  • Audio segmentation
  • Cross-modal queries
  • Information retrieval
  • Multimedia indexing
  • Multimodal processing
  • Music videos
  • Novelty detection
