Existing text-to-avatar methods are either limited to static avatars that cannot be animated, or struggle to generate animatable avatars with high quality and precise pose control. To address these limitations, we propose AvatarStudio, a coarse-to-fine generative model that generates explicit textured 3D meshes for animatable human avatars. Specifically, AvatarStudio begins with a low-resolution NeRF-based representation for coarse generation, then incorporates SMPL-guided articulation into the explicit mesh representation to support avatar animation and high-resolution rendering. To ensure view consistency and pose controllability of the resulting avatars, we introduce a 2D diffusion model conditioned on DensePose for Score Distillation Sampling supervision. By effectively leveraging the synergy between the articulated mesh representation and the DensePose-conditional diffusion model, AvatarStudio can create high-quality avatars from text that are ready for animation, significantly outperforming previous methods. Moreover, it supports many applications, e.g., multimodal avatar animation and style-guided avatar creation.
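To make the Score Distillation Sampling supervision mentioned above more concrete, here is a minimal toy sketch in pure Python. It is not the AvatarStudio implementation. The names `make_toy_denoiser`, `sds_gradient`, and `optimize` are hypothetical, the "rendered image" is a flat list of floats that is optimized directly rather than through a differentiable renderer, and the stand-in denoiser simply treats the conditioning signal as the only plausible clean image. The weighting w(t) = 1 - alpha_bar is one common choice and is assumed here.

```python
import math
import random


def make_toy_denoiser(alpha_bar):
    """Hypothetical stand-in for a frozen, condition-aware noise predictor.

    It pretends the only plausible clean image is `condition` itself, so its
    noise estimate is (x_t - sqrt(alpha_bar) * condition) / sqrt(1 - alpha_bar).
    A real system would call a pretrained 2D diffusion model here.
    """
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)

    def denoiser(noised, condition):
        return [(x_t - a * c) / s for x_t, c in zip(noised, condition)]

    return denoiser


def sds_gradient(rendered, condition, denoiser, alpha_bar, rng):
    """One SDS gradient estimate with respect to the rendered image."""
    # Sample Gaussian noise and form the noised image x_t.
    noise = [rng.gauss(0.0, 1.0) for _ in rendered]
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    noised = [a * x + s * n for x, n in zip(rendered, noise)]
    # The frozen model predicts the noise; its weights are never updated.
    predicted = denoiser(noised, condition)
    # Assumed weighting w(t) = 1 - alpha_bar.
    weight = 1.0 - alpha_bar
    # SDS uses the residual (predicted noise - true noise) as the gradient.
    return [weight * (p - n) for p, n in zip(predicted, noise)]


def optimize(rendered, condition, steps=200, lr=0.5, alpha_bar=0.5, seed=0):
    """Plain gradient descent on the image using SDS gradient estimates."""
    rng = random.Random(seed)
    denoiser = make_toy_denoiser(alpha_bar)
    current = list(rendered)
    for _ in range(steps):
        grad = sds_gradient(current, condition, denoiser, alpha_bar, rng)
        current = [x - lr * g for x, g in zip(current, grad)]
    return current


if __name__ == "__main__":
    # The image is pulled toward what the conditioned denoiser finds plausible.
    print(optimize([0.0, 1.0, -2.0], [0.5, 0.5, 0.5]))
```

In this toy setting the optimized values move toward the conditioning signal, which mirrors how a pose-conditioned diffusion prior can steer renderings toward the target pose. In a full pipeline the same residual would be backpropagated through the renderer into the 3D representation's parameters.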
AvatarStudio generates high-quality avatars in a multi-view consistent way.
AvatarStudio aligns the generated avatars closely with the detailed descriptions in complex prompts.
We compare AvatarStudio with other text-guided generation methods.
Methods compared: DreamFusion, Magic3D-Fine, DreamAvatar, DreamWaltz, and ours.
Methods compared: DreamFusion, Magic3D-Fine, DreamHuman, AvatarVerse, and ours.
We compare avatars generated with fewer optimization steps against the original results. Left: results obtained with reduced optimization steps (1 hour). Right: original results (2.5 hours). Even when optimized with fewer steps, the model still yields results comparable to the original ones.
AvatarStudio provides high-quality and easy-to-use animation, allowing users to drive the generated avatars with multimodal signals, such as text or video.
AvatarStudio supports stylized avatar creation by simply providing an additional style image.
@article{zhang2023avatarstudio,
author = {Zhang, Jianfeng and Zhang, Xuanmeng and Zhang, Huichao and Liew, Jun Hao and Zhang, Chenxu and Yang, Yi and Feng, Jiashi},
title = {AvatarStudio: High-fidelity and Animatable 3D Avatar Creation from Text},
journal = {arXiv},
year = {2023},
}