TY - JOUR
T1 - Self-Supervised Learning of Detailed 3D Face Reconstruction
AU - Chen, Yajing
AU - Wu, Fanzi
AU - Wang, Zeyu
AU - Song, Yibing
AU - Ling, Yonggen
AU - Bao, Linchao
PY - 2020/1/1
Y1 - 2020/1/1
N2 - In this article, we present an end-to-end learning framework for detailed 3D face reconstruction from a single image. Our approach uses a 3DMM-based coarse model and a displacement map in UV-space to represent a 3D face. Unlike previous work addressing this problem, our learning framework does not require supervision from surrogate ground-truth 3D models computed with traditional approaches. Instead, we utilize the input image itself as supervision during learning. In the first stage, we combine a photometric loss and a facial perceptual loss between the input face and the rendered face to regress a 3DMM-based coarse model. In the second stage, both the input image and the regressed texture of the coarse model are unwrapped into UV-space and then sent through an image-to-image translation network to predict a displacement map in UV-space. The displacement map and the coarse model are used to render a final detailed face, which again can be compared with the original input image to serve as a photometric loss for the second stage. The advantage of learning a displacement map in UV-space is that face alignment is explicitly performed during unwrapping, so facial details are easier to learn from a large amount of data. Extensive experiments demonstrate the superiority of our method over previous work.
AB - In this article, we present an end-to-end learning framework for detailed 3D face reconstruction from a single image. Our approach uses a 3DMM-based coarse model and a displacement map in UV-space to represent a 3D face. Unlike previous work addressing this problem, our learning framework does not require supervision from surrogate ground-truth 3D models computed with traditional approaches. Instead, we utilize the input image itself as supervision during learning. In the first stage, we combine a photometric loss and a facial perceptual loss between the input face and the rendered face to regress a 3DMM-based coarse model. In the second stage, both the input image and the regressed texture of the coarse model are unwrapped into UV-space and then sent through an image-to-image translation network to predict a displacement map in UV-space. The displacement map and the coarse model are used to render a final detailed face, which again can be compared with the original input image to serve as a photometric loss for the second stage. The advantage of learning a displacement map in UV-space is that face alignment is explicitly performed during unwrapping, so facial details are easier to learn from a large amount of data. Extensive experiments demonstrate the superiority of our method over previous work.
KW - Face
KW - Three-dimensional displays
KW - Solid modeling
KW - Image reconstruction
KW - Computational modeling
KW - Training
KW - Supervised learning
UR - https://ieeexplore.ieee.org/document/9178990/
UR - https://www.scopus.com/pages/publications/85091116323
U2 - 10.1109/TIP.2020.3017347
DO - 10.1109/TIP.2020.3017347
M3 - Article
SN - 1941-0042
VL - 29
SP - 8696
EP - 8705
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9178990
ER -