TY - GEN
T1 - Deep Reinforcement Learning for Audio-Visual Gaze Control
AU - Lathuilière, Stéphane
AU - Massé, Benoit
AU - Mesejo, Pablo
AU - Horaud, Radu
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/27
Y1 - 2018/12/27
AB - We address the problem of audio-visual gaze control in the specific context of human-robot interaction, namely how controlled robot motions are combined with visual and acoustic observations in order to direct the robot head towards targets of interest. The paper makes the following contributions: (i) a novel audio-visual fusion framework that is well suited for controlling the gaze of a robotic head; (ii) a reinforcement learning (RL) formulation for the gaze control problem, using a reward function based on the available temporal sequence of camera and microphone observations; and (iii) several deep architectures that allow experimentation with early and late fusion of audio and visual data. We introduce a simulated environment that enables us to learn the proposed deep RL model without the need to spend hours in tedious interaction. By thoroughly experimenting on a publicly available dataset and on a real robot, we provide empirical evidence that our method achieves state-of-the-art performance.
UR - https://www.scopus.com/pages/publications/85063008630
U2 - 10.1109/IROS.2018.8594327
DO - 10.1109/IROS.2018.8594327
M3 - Conference contribution
AN - SCOPUS:85063008630
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 1555
EP - 1562
BT - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2018
Y2 - 1 October 2018 through 5 October 2018
ER -