XAGen can synthesize realistic 3D avatars with detailed geometry, while providing disentangled control over expressive attributes, i.e., facial expressions, jaw poses, body poses, and hand poses.
Given different SMPL-X sequences with various hand, jaw, and body poses, XAGen can animate the generated 3D avatars accordingly.
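To make the notion of separate hand, jaw, and body controls concrete, the sketch below splits one frame of a full SMPL-X pose vector into its per-part groups. It assumes the standard SMPL-X layout of 55 joints in axis-angle form (1 global orientation, 21 body joints, 1 jaw, 2 eyes, 15 joints per hand), giving 165 values per frame. The helper names `split_smplx_pose` and `replace_group` are hypothetical and are not part of XAGen or of any SMPL-X library; the exact input format XAGen expects is not stated here.

```python
# Assumed SMPL-X joint grouping, in the order used by the full pose vector.
# Each joint contributes 3 axis-angle values.
SMPLX_GROUPS = [
    ("global_orient", 1),
    ("body_pose", 21),
    ("jaw_pose", 1),
    ("leye_pose", 1),
    ("reye_pose", 1),
    ("left_hand_pose", 15),
    ("right_hand_pose", 15),
]


def split_smplx_pose(full_pose):
    """Split a flat 165-value pose vector into named per-part lists."""
    expected = 3 * sum(n for _, n in SMPLX_GROUPS)  # 55 joints * 3 = 165
    if len(full_pose) != expected:
        raise ValueError(f"expected {expected} values, got {len(full_pose)}")
    groups = {}
    start = 0
    for name, n_joints in SMPLX_GROUPS:
        end = start + 3 * n_joints
        groups[name] = list(full_pose[start:end])
        start = end
    return groups


def replace_group(full_pose, name, new_values):
    """Return a copy of the pose with one group swapped, others untouched."""
    groups = split_smplx_pose(full_pose)
    if len(new_values) != len(groups[name]):
        raise ValueError(f"{name} needs {len(groups[name])} values")
    groups[name] = list(new_values)
    # Re-flatten in the canonical group order.
    return [v for group_name, _ in SMPLX_GROUPS for v in groups[group_name]]


if __name__ == "__main__":
    rest_pose = [0.0] * 165
    # Open the jaw slightly while leaving body and hands in the rest pose.
    talking = replace_group(rest_pose, "jaw_pose", [0.2, 0.0, 0.0])
    print(split_smplx_pose(talking)["jaw_pose"])
```

Applying `replace_group` frame by frame is one way to vary a single attribute, such as the jaw, across a sequence while holding the others fixed. Facial expression coefficients are a separate SMPL-X parameter and are not part of this pose vector.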
brown hair, red T-shirt, blue jeans | blonde hair, pink T-shirt, black trousers
We animate our synthesized avatars using the audio stream and the corresponding SMPL-X sequence provided by TalkSHOW, an open-source audio-to-motion method (video with sound). Please refer to Section 4.3 for more details.
We show more audio-driven animation results for avatars synthesized by XAGen, rendered from multiple views. These videos contain audio.
@inproceedings{XAGen2023,
  title={XAGen: 3D Expressive Human Avatars Generation},
  author={Xu, Zhongcong and Zhang, Jianfeng and Liew, Junhao and Feng, Jiashi and Shou, Mike Zheng},
  booktitle={NeurIPS},
  year={2023}
}