The Audio-Visual BatVision Dataset for Research on Sight and Sound

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Vision research has shown remarkable success in understanding our world, propelled by datasets of images and videos. Sensor data from radar, LiDAR and cameras has supported research in robotics and autonomous driving for at least a decade. However, while visual sensors may fail in some conditions, sound has recently shown potential to complement sensor data. Simulated room impulse responses (RIR) in 3D apartment models became a benchmark dataset for the community, fostering a range of audio-visual research. In simulation, depth is predictable from sound by learning bat-like perception with a neural network. Concurrently, the same was achieved in reality by using RGB-D images and echoes of chirping sounds. Biomimicking bat perception is an exciting new direction but needs dedicated datasets to explore its potential. Therefore, we collected the BatVision dataset to provide large-scale echoes in complex real-world scenes to the community. We equipped a robot with a speaker to emit chirps and a binaural microphone to record their echoes. Synchronized RGB-D images from the same perspective provide visual labels of traversed spaces. We sampled spaces ranging from modern US offices to historic French university grounds, indoor and outdoor, with large architectural variety. This dataset will allow research on robot echolocation, general audio-visual tasks and sound phenomena unavailable in simulated data. We show promising results for audio-only depth prediction and show how state-of-the-art work developed for simulated data can also succeed on our dataset. Project page: https://amandinebtto.github.io/Batvision-Dataset/

Original language: English
Title of host publication: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5812-5819
Number of pages: 8
ISBN (Electronic): 9781665491907
Publication status: Published - 1 Jan 2023
Externally published: Yes
Event: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023 - Detroit, United States
Duration: 1 Oct 2023 to 5 Oct 2023

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Country/Territory: United States
City: Detroit
Period: 1/10/23 to 5/10/23
