Speech Emotion Recognition Using 1D/2D Convolutional Neural Networks

Pencea Maria Larisa, Ruxandra Tapu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Over the last few decades, emotion recognition has been a central research topic in the affective computing community. Automatically identifying emotions from raw speech signals is highly challenging and depends on multiple factors, including the utterance length and the speaker's language, gender, and accent. In addition, the process is highly subjective, because people can perceive emotions differently. The goal of this paper is therefore to evaluate several state-of-the-art deep convolutional neural network (DCNN) architectures that receive various 1D/2D speech feature representations as input, conduct experiments on a publicly available dataset (the Ryerson Audio-Visual Database of Emotional Speech and Song, RAVDESS), and identify which architecture performs best on the discrete emotion classification task.
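The abstract contrasts 1D and 2D speech feature representations as CNN inputs. The paper itself does not publish code, but the distinction can be illustrated with a minimal NumPy sketch, assuming a generic setup: the 1D input is the raw waveform, and the 2D input is a log-magnitude spectrogram; the frame length, hop size, and tensor layouts below are illustrative choices, not values from the paper.

```python
import numpy as np

def frame_signal(x, frame_len=512, hop=256):
    # Split a 1D waveform into overlapping frames -> (n_frames, frame_len).
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def log_spectrogram(x, frame_len=512, hop=256):
    # 2D time-frequency representation: log magnitude of the windowed FFT.
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(mag)  # shape: (n_frames, frame_len // 2 + 1)

# Toy "utterance": one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
waveform = np.sin(2 * np.pi * 440 * t).astype(np.float32)

# 1D CNN input: (batch, samples, channels).
x_1d = waveform[None, :, None]
# 2D CNN input: (batch, time, frequency, channels).
x_2d = log_spectrogram(waveform)[None, :, :, None]

print(x_1d.shape)  # (1, 16000, 1)
print(x_2d.shape)  # (1, 61, 257, 1)
```

A 1D CNN convolves along the sample axis of `x_1d`, while a 2D CNN treats `x_2d` like an image, convolving jointly over time and frequency; the RAVDESS experiments in the paper compare architectures built on these two kinds of input.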

Original language: English
Title of host publication: 2022 15th International Symposium on Electronics and Telecommunications, ISETC 2022 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665451505
DOIs
Publication status: Published - 1 Jan 2022
Externally published: Yes
Event: 15th International Symposium on Electronics and Telecommunications, ISETC 2022 - Timisoara, Romania
Duration: 10 Nov 2022 - 11 Nov 2022

Publication series

Name: 2022 15th International Symposium on Electronics and Telecommunications, ISETC 2022 - Conference Proceedings

Conference

Conference: 15th International Symposium on Electronics and Telecommunications, ISETC 2022
Country/Territory: Romania
City: Timisoara
Period: 10/11/22 - 11/11/22

Keywords

  • affective networks
  • deep convolutional neural networks
  • speech emotion recognition

