Abstract
Supervised machine learning relies on the availability of large annotated datasets. This is essential because small datasets generally lead to overfitting when training high-dimensional machine-learning models. Since manually annotating such large datasets is a long, tedious and expensive process, another possibility is to artificially increase the size of the dataset. This is known as data augmentation. In this paper we provide an in-depth analysis of two data augmentation methods: sound transformations and sound segmentation. The first transforms a music track into a set of new music tracks by applying processes such as pitch-shifting, time-stretching or filtering. The second splits a long sound signal into a set of shorter time segments. We study the effect of these two techniques, and of their parameters, on a genre classification task using public datasets. The main contribution of this work is to detail by experimentation the benefit of these methods, used alone or together, during training and/or testing. We also demonstrate their use in improving robustness to potentially unknown sound degradations. Based on an analysis of these results, we provide good-practice recommendations.
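As a rough illustration of the second method (sound segmentation), the sketch below splits a signal into fixed-length, possibly overlapping segments and lets each segment inherit the label of its source track. It is not the authors' implementation: the function name `segment_signal`, the parameters `segment_len` and `hop`, the choice to drop a short trailing remainder, and the toy `"rock"` label are all assumptions made for the example.

```python
from typing import List, Sequence


def segment_signal(signal: Sequence[float], segment_len: int, hop: int) -> List[Sequence[float]]:
    """Split a signal into segments of `segment_len` samples, one every `hop` samples.

    Assumption: a trailing remainder shorter than `segment_len` is dropped.
    """
    if segment_len <= 0 or hop <= 0:
        raise ValueError("segment_len and hop must be positive")
    last_start = len(signal) - segment_len
    # range() is empty when the signal is shorter than one segment.
    return [signal[start:start + segment_len] for start in range(0, last_start + 1, hop)]


# Toy "track" of 10 samples, cut into 4-sample segments with 50% overlap.
track = [0.1 * n for n in range(10)]
segments = segment_signal(track, segment_len=4, hop=2)

# Each segment inherits the (hypothetical) genre label of the whole track,
# so one annotated track yields several training examples.
labelled = [(segment, "rock") for segment in segments]
```

With a hop smaller than the segment length the segments overlap, which yields more training examples per track at the cost of stronger correlation between them.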
| Original language | English |
|---|---|
| Pages (from - to) | 97-110 |
| Number of pages | 14 |
| Journal | Transactions of the International Society for Music Information Retrieval |
| Volume | 2 |
| Issue number | 1 |
| DOIs | |
| Status | Published - 1 Jan 2019 |
Fingerprint
Explore the research topics of "An Analysis of the Effect of Data Augmentation Methods: Experiments for a Musical Genre Classification Task". Together they form a unique fingerprint.