TY - GEN
T1 - Soft Disentanglement in Frequency Bands for Neural Audio Codecs
AU - Giniès, Benoît
AU - Bie, Xiaoyu
AU - Fercoq, Olivier
AU - Richard, Gaël
N1 - Publisher Copyright: © 2025 European Signal Processing Conference, EUSIPCO. All rights reserved.
PY - 2025/1/1
Y1 - 2025/1/1
AB - In neural-based audio feature extraction, ensuring that representations capture disentangled information is crucial for model interpretability. However, existing disentanglement methods often rely on assumptions that are highly dependent on data characteristics or specific tasks. In this work, we introduce a generalizable approach for learning disentangled features within a neural architecture. Our method applies spectral decomposition to time-domain signals, followed by a multi-branch audio codec that operates on the decomposed components. Empirical evaluations demonstrate that our approach achieves better reconstruction and perceptual performance compared to a state-of-the-art baseline while also offering potential advantages for inpainting tasks.
KW - Disentanglement
KW - Frequency Decomposition
KW - Inpainting
KW - Neural Audio Codec
UR - https://www.scopus.com/pages/publications/105029853787
U2 - 10.23919/EUSIPCO63237.2025.11226244
DO - 10.23919/EUSIPCO63237.2025.11226244
M3 - Conference contribution
AN - SCOPUS:105029853787
T3 - European Signal Processing Conference
SP - 11
EP - 15
BT - 2025 33rd European Signal Processing Conference, EUSIPCO 2025 - Proceedings
PB - European Signal Processing Conference, EUSIPCO
T2 - 33rd European Signal Processing Conference, EUSIPCO 2025
Y2 - 8 September 2025 through 12 September 2025
ER -