Invertible Neural Skinning
Accepted to CVPR, 2023

Overview: Invertible Neural Skinning

Building animatable and editable models of clothed humans from raw 3D scans and poses is a challenging problem. Existing reposing methods suffer from the following issues:

  1. Limited expressiveness of Linear Blend Skinning (LBS).
  2. Costly mesh extraction required to generate each new pose.
  3. Failure to preserve surface correspondences across different poses.

In this work, we introduce Invertible Neural Skinning (INS) to address all of these shortcomings. The entire end-to-end pipeline is differentiable and looks as follows:

[Figure: overview of the end-to-end INS pipeline]
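
As a rough sketch of how such a differentiable reposing pipeline can be wired together (the module and argument names here are illustrative assumptions, not the paper's actual code):

```python
import torch
import torch.nn as nn

class INS(nn.Module):
    """Hypothetical end-to-end reposing pipeline: PIN followed by LBS."""
    def __init__(self, pin: nn.Module, lbs_net: nn.Module):
        super().__init__()
        self.pin = pin          # Pose-conditioned Invertible Network (next section)
        self.lbs_net = lbs_net  # predicts per-point skinning weight logits

    def forward(self, x_canonical, pose, bone_transforms):
        # 1) Non-linear, pose-varying deformation applied in canonical space.
        x = self.pin(x_canonical, pose)                      # (N, 3)
        # 2) Linear Blend Skinning: blend bone transforms with learned weights.
        w = torch.softmax(self.lbs_net(x), dim=-1)           # (N, B)
        T = torch.einsum('nb,bij->nij', w, bone_transforms)  # (N, 4, 4)
        x_h = torch.cat([x, torch.ones_like(x[..., :1])], dim=-1)
        return torch.einsum('nij,nj->ni', T, x_h)[..., :3]   # posed points
```

Every step is differentiable, so the whole pipeline can be trained end to end from raw scans and poses.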

PIN: Pose-conditioned Invertible Network

Invertible networks are bijective functions composed of modular components called coupling layers, which preserve one-to-one correspondences between their inputs and outputs. We propose a pose-aware coupling layer (right side of the figure below), and chain several such layers together to construct a Pose-conditioned Invertible Network (PIN). To encode the body pose, we use a per-bone encoder (left side of the figure below) that takes into account the relative body pose between the canonical and deformed spaces.
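
To make the coupling-layer idea concrete, here is a minimal RealNVP-style affine coupling layer conditioned on a pose code. This is a sketch under the assumption of an affine parameterization; the paper's exact design may differ, and all names are illustrative:

```python
import torch
import torch.nn as nn

class PoseCouplingLayer(nn.Module):
    """One pose-conditioned coupling layer: bijective by construction."""
    def __init__(self, dim: int = 3, pose_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.d = dim // 2 + 1  # coordinates passed through unchanged
        # Predict a scale and shift for the remaining coordinates.
        self.net = nn.Sequential(
            nn.Linear(self.d + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.d)),
        )

    def forward(self, x, pose_code):
        # pose_code: per-point pose features, e.g. from the per-bone encoder.
        x1, x2 = x[..., :self.d], x[..., self.d:]
        s, t = self.net(torch.cat([x1, pose_code], dim=-1)).chunk(2, dim=-1)
        return torch.cat([x1, x2 * torch.exp(s) + t], dim=-1)

    def inverse(self, y, pose_code):
        # Exact closed-form inverse: recompute (s, t) from the unchanged half.
        y1, y2 = y[..., :self.d], y[..., self.d:]
        s, t = self.net(torch.cat([y1, pose_code], dim=-1)).chunk(2, dim=-1)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)
```

Because each layer rewrites only one subset of coordinates with an invertible affine update, chaining several layers (alternating which coordinates pass through unchanged) yields a bijection with an exact inverse, which is what preserves point-wise correspondences.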

[Figure: per-bone pose encoder (left) and pose-aware coupling layers (right) forming PIN]


Qualitative Result: Pose-varying Deformations by PIN

We visualize the deformations produced by PIN in canonical space (before LBS) under varying target poses, and show that INS can handle complex deformations of clothing across poses.

[Figure: pose-varying deformations produced by PIN in canonical space]

Qualitative Result: Comparison with SNARF Baselines

We compare our method, INS, against all five baselines discussed in Section 4.2 of the main paper. While both LBS baselines and SNARF-NC suffer from artifacts, INS performs considerably better than the other methods.

Qualitative Result: Ablation Study

We visualize results from the ablations reported in Section 4.4 of the main paper. We find that removing SIREN leads to an overly smooth surface, while removing the LBS network makes it harder to learn limb movements correctly.

Qualitative Result: Propagating Texture via Correspondences

Because INS preserves correspondences across poses, mesh attributes such as texture can be propagated across time frames. We apply texture to the pose-independent canonical mesh and propagate it through the INS network. The applied texture deforms realistically like clothing, remains consistent across all frames, and is free of jittering.
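
Since the mapping is bijective, each canonical vertex lands at a unique posed position, so per-vertex attributes simply ride along. A minimal sketch, assuming an `ins` model like the one sketched earlier (helper and argument names are hypothetical):

```python
def propagate_texture(ins, canon_verts, canon_faces, canon_uvs, pose, bone_T):
    """Repose a textured canonical mesh without re-extracting geometry."""
    # Only vertex positions change; faces and UVs are reused as-is because
    # the invertible mapping gives a one-to-one vertex correspondence.
    posed_verts = ins(canon_verts, pose, bone_T)
    return posed_verts, canon_faces, canon_uvs
```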

The website template was borrowed from Michaël Gharbi.