TY - GEN
T1 - RIVQ-VAE
T2 - 11th International Conference on 3D Vision, 3DV 2024
AU - Mezghanni, Mariem
AU - Boulkenafed, Malika
AU - Ovsjanikov, Maks
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - Building local surface representations has recently attracted significant attention in 3D vision, making it possible to structure complex 3D shapes as sequences of simpler local geometries. Inspired by advances in 2D discrete representation learning, recent approaches have proposed to break up 3D shapes into regular grids, where each cell is associated with a discrete code sampled from a learnable codebook. Unfortunately, existing methods ignore both the local rigid self-similarities and the ambiguities inherent to 3D geometry that arise from possible changes in orientation. As a result, such techniques require very large codebooks to capture all possible variability in both geometry and pose. In this work, we propose a novel generative model that improves generation quality by compactly embedding local geometries in a rotation- and translation-invariant manner. This strategy allows our codebook of discrete codes to express a larger range of geometric structures by avoiding local and global redundancies. Crucially, we demonstrate via a careful architecture design that our approach can recover meaningful shapes from local embeddings while ensuring global consistency. Experiments show that our approach outperforms baseline methods by a large margin under similar settings.
AB - Building local surface representations has recently attracted significant attention in 3D vision, making it possible to structure complex 3D shapes as sequences of simpler local geometries. Inspired by advances in 2D discrete representation learning, recent approaches have proposed to break up 3D shapes into regular grids, where each cell is associated with a discrete code sampled from a learnable codebook. Unfortunately, existing methods ignore both the local rigid self-similarities and the ambiguities inherent to 3D geometry that arise from possible changes in orientation. As a result, such techniques require very large codebooks to capture all possible variability in both geometry and pose. In this work, we propose a novel generative model that improves generation quality by compactly embedding local geometries in a rotation- and translation-invariant manner. This strategy allows our codebook of discrete codes to express a larger range of geometric structures by avoiding local and global redundancies. Crucially, we demonstrate via a careful architecture design that our approach can recover meaningful shapes from local embeddings while ensuring global consistency. Experiments show that our approach outperforms baseline methods by a large margin under similar settings.
KW - Generative modeling
KW - Local representation
KW - Point cloud completion
KW - Single-view reconstruction
UR - https://www.scopus.com/pages/publications/85196706887
U2 - 10.1109/3DV62453.2024.00129
DO - 10.1109/3DV62453.2024.00129
M3 - Conference contribution
AN - SCOPUS:85196706887
T3 - Proceedings - 2024 International Conference on 3D Vision, 3DV 2024
SP - 1382
EP - 1391
BT - Proceedings - 2024 International Conference on 3D Vision, 3DV 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 March 2024 through 21 March 2024
ER -