SRG3: Speech-driven Robot Gesture Generation with GAN

Chuang Yu, Adriana Tapus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Human gestures occur spontaneously and are usually aligned with speech, which leads to natural and expressive interaction. Speech-driven gesture generation is important for enabling a social robot to exhibit social cues and conduct successful human-robot interaction. In this paper, the generation process involves mapping an acoustic speech representation to the corresponding gestures for a humanoid robot. The paper proposes a new GAN (Generative Adversarial Network) architecture for speech-to-gesture generation. Instead of a fixed mapping from one speech input to one gesture pattern, our end-to-end GAN structure can generate multiple mapped gesture patterns from one speech input (with multiple noise samples), just as humans do. The generated gestures can be applied to social robots with arms. The evaluation results show the effectiveness of our generative model for speech-driven robot gesture generation.
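To illustrate the one-to-many idea in the abstract, the sketch below shows a minimal conditional GAN in PyTorch: both networks are conditioned on a speech-feature sequence, and the generator additionally receives a noise vector, so different noise samples for the same speech produce different gesture sequences. All layer choices, dimensions, and names here are illustrative assumptions, not the paper's actual SRG3 architecture.

```python
# Hypothetical conditional-GAN sketch for speech-to-gesture mapping.
# Feature choices (e.g., 26-dim acoustic features), layer sizes, and
# joint dimensions are assumptions for illustration only.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a speech-feature sequence plus a noise vector to a gesture
    (joint-angle) sequence. Different noise samples for the same speech
    yield different gesture patterns (one-to-many mapping)."""
    def __init__(self, speech_dim=26, noise_dim=16, hidden_dim=128, joint_dim=10):
        super().__init__()
        self.rnn = nn.GRU(speech_dim + noise_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, joint_dim)

    def forward(self, speech, noise):
        # speech: (batch, T, speech_dim); noise: (batch, noise_dim)
        noise_seq = noise.unsqueeze(1).expand(-1, speech.size(1), -1)
        h, _ = self.rnn(torch.cat([speech, noise_seq], dim=-1))
        return self.out(h)  # (batch, T, joint_dim) joint-angle sequence

class Discriminator(nn.Module):
    """Scores whether a gesture sequence is a plausible match for the
    conditioning speech (real pair vs. generated pair)."""
    def __init__(self, speech_dim=26, joint_dim=10, hidden_dim=128):
        super().__init__()
        self.rnn = nn.GRU(speech_dim + joint_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, speech, gesture):
        _, h = self.rnn(torch.cat([speech, gesture], dim=-1))
        return self.out(h.squeeze(0))  # unnormalized real/fake logit
```

Under this setup, sampling several noise vectors for one utterance yields several distinct gesture sequences for the same speech, which is the one-to-many behaviour the abstract describes.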

Original language: English
Title of host publication: 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 759-766
Number of pages: 8
ISBN (Electronic): 9781728177090
DOIs
Publication status: Published - 13 Dec 2020
Externally published: Yes
Event: 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020 - Virtual, Shenzhen, China
Duration: 13 Dec 2020 - 15 Dec 2020

Publication series

Name: 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020

Conference

Conference: 16th IEEE International Conference on Control, Automation, Robotics and Vision, ICARCV 2020
Country/Territory: China
City: Virtual, Shenzhen
Period: 13/12/20 - 15/12/20
