Animatable and Relightable Gaussians
for High-fidelity Human Avatar Modeling


Zhe Li*1, Yipengjing Sun*2, Zerong Zheng3, Lizhen Wang1, Shengping Zhang2, Yebin Liu1

* indicates equal contribution

1Tsinghua University    2Harbin Institute of Technology    3NNKosmos Technology

Abstract


Modeling animatable human avatars from RGB videos is a long-standing and challenging problem. Recent works usually adopt MLP-based neural radiance fields (NeRF) to represent 3D humans, but it remains difficult for pure MLPs to regress pose-dependent garment details. To this end, we introduce Animatable Gaussians, a new avatar representation that leverages powerful 2D CNNs and 3D Gaussian splatting to create high-fidelity avatars. To associate 3D Gaussians with the animatable avatar, we learn a parametric template from the input videos, and then parameterize the template on two front & back canonical Gaussian maps where each pixel represents a 3D Gaussian. The learned template adapts to the garments worn, enabling the modeling of looser clothes such as dresses. Such template-guided 2D parameterization allows us to employ a powerful StyleGAN-based CNN to learn pose-dependent Gaussian maps that model detailed dynamic appearances. Furthermore, we introduce a pose projection strategy for better generalization to novel poses. To achieve realistic relighting of animatable avatars, we incorporate physically-based rendering into the avatar representation to decompose avatar materials and environment illumination. Overall, our method creates lifelike avatars with dynamic, realistic, generalizable, and relightable appearances. Experiments show that our method outperforms other state-of-the-art approaches.
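To make the representation concrete, the following is a minimal sketch of how one canonical Gaussian map could be unpacked into per-pixel 3D Gaussians. The map resolution, channel layout, and activation choices are illustrative assumptions, not the exact configuration used in the paper.

# Minimal sketch: unpacking a canonical Gaussian map into per-pixel 3D Gaussians.
# The resolution, channel layout, and activations below are assumptions for illustration.
import numpy as np

H, W = 256, 256                      # resolution of one canonical Gaussian map (assumed)
C = 3 + 4 + 3 + 1 + 3                # position offset, rotation quat, scale, opacity, color

# A front and a back map together parameterize the template surface;
# each valid pixel stores the attributes of one 3D Gaussian.
front_map = np.random.randn(H, W, C).astype(np.float32)
back_map = np.random.randn(H, W, C).astype(np.float32)
valid_mask = np.ones((H, W), dtype=bool)   # pixels covered by the template

def unpack_gaussian_map(gmap, mask):
    """Split a (H, W, C) Gaussian map into flat per-Gaussian attribute arrays."""
    px = gmap[mask]                                  # (N, C)
    offset   = px[:, 0:3]                            # 3D position offset from the template surface
    rotation = px[:, 3:7]                            # unnormalized quaternion
    scale    = np.exp(px[:, 7:10])                   # positive scales via exp
    opacity  = 1.0 / (1.0 + np.exp(-px[:, 10:11]))   # sigmoid to (0, 1)
    color    = px[:, 11:14]                          # per-Gaussian color / albedo
    rotation = rotation / np.linalg.norm(rotation, axis=1, keepdims=True)
    return offset, rotation, scale, opacity, color

gaussians_front = unpack_gaussian_map(front_map, valid_mask)
gaussians_back = unpack_gaussian_map(back_map, valid_mask)
print("Gaussians per map:", gaussians_front[0].shape[0])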


Animation and Relighting


Free view rendering with a fixed novel light
Fixed view rendering with a rotating novel light
Animation under challenging poses and novel lights

Method

 

Illustration of the avatar modeling pipeline. It contains two main steps: 1) Reconstruct a character-specific template from multi-view images. 2) Predict pose-dependent Gaussian and intrinsic maps through StyleUNets, then render the posed Gaussians via Gaussian splatting and physically-based rendering to learn both pose-dependent dynamics and avatar materials. Finally, given a novel environment light, we can animate the avatar with realistic dynamic appearances and shadow effects.
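The sketch below illustrates the second step with a small stand-in CNN in place of the StyleUNet: a pose-conditioned input map is mapped to Gaussian attribute maps and intrinsic (material) maps. The network architecture, channel counts, and input encoding are assumptions for illustration; skinning, Gaussian splatting, and physically-based shading are omitted.

# Hedged sketch of step 2: a tiny CNN (stand-in for the paper's StyleUNet) maps a
# pose-conditioned position map to Gaussian attribute maps plus intrinsic maps
# (e.g. albedo, roughness) used for physically-based relighting. All channel counts
# and the network itself are illustrative assumptions.
import torch
import torch.nn as nn

class TinyPoseConditionedCNN(nn.Module):
    """Placeholder for the StyleUNet: posed position map -> attribute maps."""
    def __init__(self, in_ch=3, gaussian_ch=14, intrinsic_ch=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, gaussian_ch + intrinsic_ch, 3, padding=1),
        )
        self.gaussian_ch = gaussian_ch

    def forward(self, posed_position_map):
        out = self.net(posed_position_map)
        gaussian_maps = out[:, :self.gaussian_ch]    # offsets, rotations, scales, opacity, color
        intrinsic_maps = out[:, self.gaussian_ch:]   # e.g. albedo (3) + roughness (1)
        return gaussian_maps, intrinsic_maps

# One front canonical map; in the full pipeline the back map is processed the same way.
posed_position_map = torch.randn(1, 3, 256, 256)     # template surface rasterized as a position map
model = TinyPoseConditionedCNN()
gaussian_maps, intrinsic_maps = model(posed_position_map)

# The per-pixel Gaussians would next be skinned to the target pose and rendered with
# a Gaussian splatting rasterizer plus a physically-based shading model under the
# target environment light (both omitted here).
print(gaussian_maps.shape, intrinsic_maps.shape)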

 


Demo Video


If the video does not play, please click here to watch it.

Citation



@article{li2024animatable,
  title={Animatable and Relightable Gaussians for High-fidelity Human Avatar Modeling},
  author={Li, Zhe and Sun, Yipengjing and Zheng, Zerong and Wang, Lizhen and Zhang, Shengping and Liu, Yebin},
  journal={arXiv preprint arXiv:2311.16096v4},
  year={2024},
}