Abstract
In this work, we address the task of unconditional head motion generation to animate still human faces in a low-dimensional semantic space from a single reference pose. Unlike traditional audio-conditioned talking head generation, which seldom emphasizes realistic head motion, we devise a GAN-based architecture that learns to synthesize rich head motion sequences over long durations while keeping error accumulation low. In particular, the autoregressive generation of incremental outputs ensures smooth trajectories, while a multi-scale discriminator on input pairs drives generation toward better handling of high- and low-frequency signals and less mode collapse. We experimentally demonstrate the relevance of the proposed method and show its superiority over models that attained state-of-the-art performance on similar tasks.
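The idea of autoregressively generating incremental outputs can be illustrated with a minimal sketch. This is not the authors' GAN: `toy_increment_model`, `rollout`, and `max_step_size` are hypothetical names, and the learned generator is replaced by a hand-written stand-in that returns a small bounded pose increment. The sketch only shows why predicting increments from a single reference pose yields a smooth trajectory: each frame is the previous frame plus a small step.

```python
import random


def toy_increment_model(history, rng):
    """Hypothetical stand-in for a learned generator (an assumption, not the paper's model).

    Returns a small pose increment: a weak pull toward zero plus bounded noise.
    """
    last = history[-1]
    return [-0.05 * x + rng.uniform(-0.01, 0.01) for x in last]


def rollout(reference_pose, num_steps, step_model, seed=0):
    """Autoregressive rollout: each new pose is the previous pose plus a predicted increment."""
    rng = random.Random(seed)
    trajectory = [list(reference_pose)]
    for _ in range(num_steps):
        # The model sees everything generated so far and predicts only a delta.
        delta = step_model(trajectory, rng)
        next_pose = [x + d for x, d in zip(trajectory[-1], delta)]
        trajectory.append(next_pose)
    return trajectory


def max_step_size(trajectory):
    """Largest per-coordinate change between consecutive poses."""
    return max(
        max(abs(b - a) for a, b in zip(prev, cur))
        for prev, cur in zip(trajectory, trajectory[1:])
    )


if __name__ == "__main__":
    # A 3-value pose (for example yaw, pitch, roll) is an illustrative assumption.
    poses = rollout([0.2, -0.1, 0.05], num_steps=100, step_model=toy_increment_model)
    print(len(poses), max_step_size(poses))
```

Because the stand-in model bounds every increment, consecutive poses can never jump by more than that bound, which is the smoothness property the incremental formulation is meant to encourage. In the paper's setting the increments would come from the trained generator rather than from this toy rule.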
| Original language | English |
|---|---|
| Article number | 5154 |
| Journal | ACM Transactions on Multimedia Computing, Communications and Applications |
| Volume | 21 |
| Issue number | 1 |
| DOIs | |
| Status | Published - 16 Dec 2024 |
Title: Autoregressive GAN for Semantic Unconditional Head Motion Generation