A generic system for audio indexing: Application to speech/music segmentation and music genre recognition

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In this paper we present a generic system for audio indexing (classification/segmentation) and apply it to two common problems: speech/music segmentation and music genre recognition. We first present some requirements for the design of a generic system. Its training part is based on a succession of four steps: feature extraction, feature selection, feature space transform and statistical modeling. We then propose several approaches for the indexing part, depending on the local/global characteristics of the indexes to be found. In particular, we propose the use of segment-statistical models. The system is then applied to the two problems. The first is the speech/music segmentation of a radio stream. The application is developed in a real industrial framework using real-world categories and data. The performance obtained on the pure speech/music classes is good. However, when the non-pure categories (mixed, bed) are also considered, the performance of the system drops. The second problem is music genre recognition. Since the indexes to be found are global, "segment-statistical models" are used, leading to results close to the state of the art.
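The distinction between local and global indexes can be illustrated with a minimal sketch. It assumes that some frame-level classifier has already produced per-class log-likelihoods for each analysis frame. The function names, the majority-vote smoothing and the summed log-likelihood rule are illustrative assumptions, not the segment-statistical models proposed in the paper.

```python
from collections import Counter


def local_labels(frame_scores, window=3):
    """Local indexing (hypothetical helper): one label per frame.

    Each frame is first assigned its highest-scoring class, then the
    decision is smoothed by majority vote over a sliding window.
    """
    raw = [max(scores, key=scores.get) for scores in frame_scores]
    half = window // 2
    smoothed = []
    for i in range(len(raw)):
        neighbourhood = raw[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbourhood).most_common(1)[0][0])
    return smoothed


def global_label(frame_scores):
    """Global indexing (hypothetical helper): one label for the whole item.

    Log-likelihoods are summed over all frames and the class with the
    largest total is returned.
    """
    totals = {}
    for scores in frame_scores:
        for cls, log_lik in scores.items():
            totals[cls] = totals.get(cls, 0.0) + log_lik
    return max(totals, key=totals.get)


# Made-up log-likelihoods for five frames; the third frame is an
# isolated "music" decision inside a speech region.
frames = [
    {"speech": -1.0, "music": -2.0},
    {"speech": -1.0, "music": -2.0},
    {"speech": -2.0, "music": -1.5},
    {"speech": -1.0, "music": -2.0},
    {"speech": -1.0, "music": -3.0},
]
print(local_labels(frames))
print(global_label(frames))
```

A local task such as speech/music segmentation needs a label sequence over time, as in `local_labels`, whereas a global task such as genre recognition needs a single decision per item, as in `global_label`.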

Original language: English
Title of host publication: Proceedings of the 10th International Conference on Digital Audio Effects, DAFx 2007
Pages: 205-212
Number of pages: 8
Publication status: Published - 1 Jan 2007
Externally published: Yes
Event: 10th International Conference on Digital Audio Effects, DAFx 2007 - Bordeaux, France
Duration: 10 Sept 2007 to 15 Sept 2007

Publication series

Name: Proceedings of the International Conference on Digital Audio Effects, DAFx
ISSN (Print): 2413-6700
ISSN (Electronic): 2413-6689

Conference

Conference: 10th International Conference on Digital Audio Effects, DAFx 2007
Country/Territory: France
City: Bordeaux
Period: 10/09/07 to 15/09/07
