Singing voice detection with deep recurrent neural networks

Simon Leglaive, Romain Hennequin, Roland Badeau

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

In this paper, we propose a new method for singing voice detection based on a Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Network (RNN). This classifier is able to take a past and future temporal context into account to decide on the presence/absence of singing voice, thus using the inherent sequential aspect of a short-term feature extraction in a piece of music. The BLSTM-RNN contains several hidden layers, so it is able to extract a simple representation fitted to our task from low-level features. The results we obtain significantly outperform state-of-the-art methods on a common database.

Original languageEnglish
Title of host publication2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages121-125
Number of pages5
ISBN (Electronic)9781467369978
DOIs
Publication statusPublished - 4 Aug 2015
Event40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: 19 Apr 201424 Apr 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2015-August
ISSN (Print)1520-6149

Conference

Conference40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country/TerritoryAustralia
CityBrisbane
Period19/04/1424/04/14

Keywords

  • Deep Learning
  • Long Short-Term Memory
  • Recurrent Neural Networks
  • Singing Voice Detection

Fingerprint

Dive into the research topics of 'Singing voice detection with deep recurrent neural networks'. Together they form a unique fingerprint.

Cite this