An Analysis of the Effect of Data Augmentation Methods: Experiments for a Musical Genre Classification Task

Research output: Contribution to journal, Article, peer-review

Abstract

Supervised machine learning relies on access to large annotated datasets. This is essential because small datasets generally lead to overfitting when training high-dimensional machine-learning models. Because manually annotating such large datasets is long, tedious and expensive, an alternative is to artificially increase the size of the dataset, a practice known as data augmentation. In this paper we provide an in-depth analysis of two data augmentation methods: sound transformations and sound segmentation. The first turns a music track into a set of new music tracks by applying processes such as pitch-shifting, time-stretching or filtering. The second splits a long sound signal into a set of shorter time segments. We study the effect of these two techniques, and of their parameters, on a genre classification task using public datasets. The main contribution of this work is to detail, through experiments, the benefit of these methods, used alone or together, during training and/or testing. We also demonstrate their use in improving robustness to potentially unknown sound degradations. Based on an analysis of these results, we provide good-practice recommendations.
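As a rough illustration of the sound segmentation idea described in the abstract, the following sketch splits a track into fixed-length segments that inherit the track's genre label, and combines per-segment predictions back into one track-level label. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the overlap controlled by `hop_len`, the dropping of an incomplete trailing segment, and the use of majority voting at test time are all hypothetical choices made for this example.

```python
from collections import Counter


def segment_signal(samples, segment_len, hop_len):
    """Split a sequence of samples into fixed-length segments.

    Segments overlap when hop_len < segment_len. Trailing samples that do
    not fill a whole segment are dropped (an assumption of this sketch).
    """
    if segment_len <= 0 or hop_len <= 0:
        raise ValueError("segment_len and hop_len must be positive")
    last_start = len(samples) - segment_len
    return [samples[start:start + segment_len]
            for start in range(0, last_start + 1, hop_len)]


def augment_by_segmentation(tracks, labels, segment_len, hop_len):
    """Turn each (track, label) pair into several (segment, label) pairs."""
    pairs = []
    for track, label in zip(tracks, labels):
        for segment in segment_signal(track, segment_len, hop_len):
            pairs.append((segment, label))
    return pairs


def majority_vote(segment_predictions):
    """Aggregate per-segment predictions into one track-level label.

    Majority voting is only one possible aggregation rule (assumed here).
    """
    return Counter(segment_predictions).most_common(1)[0][0]
```

In this sketch, a shorter `segment_len` or a smaller `hop_len` yields more training examples per track, which is the kind of parameter whose effect the paper sets out to study.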

Original language: English
Pages (from-to): 97-110
Number of pages: 14
Journal: Transactions of the International Society for Music Information Retrieval
Volume: 2
Issue number: 1
Publication status: Published - 1 Jan 2019

Keywords

  • Data Augmentation
  • Datasets
  • Musical genre classification
  • Supervised training
