TY - GEN
T1 - Progressive perceptual audio rendering of complex scenes
AU - Moeck, Thomas
AU - Bonneel, Nicolas
AU - Tsingos, Nicolas
AU - Drettakis, George
AU - Viaud-Delmon, Isabelle
AU - Alloza, David
PY - 2007/12/1
Y1 - 2007/12/1
N2 - Despite recent advances, including sound source clustering and perceptual auditory masking, high quality rendering of complex virtual scenes with thousands of sound sources remains a challenge. Two major bottlenecks appear as the scene complexity increases: the cost of clustering itself, and the cost of pre-mixing source signals within each cluster. In this paper, we first propose an improved hierarchical clustering algorithm that remains efficient for large numbers of sources and clusters while providing progressive refinement capabilities. We then present a lossy pre-mixing method based on a progressive representation of the input audio signals and the perceptual importance of each sound source. Our quality evaluation user tests indicate that the recently introduced audio saliency map is inappropriate for this task. Consequently we propose a "pinnacle", loudness-based metric, which gives the best results for a variety of target computing budgets. We also performed a perceptual pilot study which indicates that in audio-visual environments, it is better to allocate more clusters to visible sound sources. We propose a new clustering metric using this result. As a result of these three solutions, our system can provide high quality rendering of thousands of 3D-sound sources on a "gamer-style" PC.
KW - Audio rendering
KW - Auditory masking
KW - Clustering
KW - Ventriloquism
DO - 10.1145/1230100.1230133
M3 - Conference contribution
AN - SCOPUS:77950585363
SN - 9781595936288
T3 - Proceedings - I3D 2007, ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games
SP - 189
EP - 196
BT - Proceedings - I3D 2007, ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games
T2 - ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, I3D 2007
Y2 - 30 April 2007 through 2 May 2007
ER -