iNVS : Repurposing Diffusion Inpainters for Novel View Synthesis 
                
                    Accepted to SIGGRAPH Asia, 2023
                
            
- Yash Kant, University of Toronto
- Aliaksandr Siarohin, Snap Research
- Michael Vasilkovsky, Snap Research
- Riza Alp Guler, Snap Research
- Jian Ren, Snap Research
- Sergey Tulyakov, Snap Research
- Igor Gilitschenski, University of Toronto
Overview
- We present a method for generating consistent novel views from a single source image that focuses on maximizing the reuse of the pixels visible in that image.
- We use a monocular depth estimator to transfer visible pixels from the source view to the target view, and then train a diffusion inpainter on the Objaverse dataset to fill in the missing pixels.

Method: iNVS
Depth-based Splatting to create Partial Views (Left). We use ZoeDepth to unproject the source view into 3D, and apply depth-based Softmax Splatting to create a partial target view.
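The unproject-and-splat step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a pinhole camera whose intrinsics `K` are shared by both views, takes the depth map as given (in iNVS it would come from ZoeDepth), rounds each point to its nearest target pixel instead of splatting bilinearly, and uses a hypothetical sharpness parameter `alpha` to weight points by inverse depth. The function name `splat_to_target` is also made up for illustration.

```python
import numpy as np

def splat_to_target(image, depth, K, R, t, alpha=10.0):
    """Forward-warp a source view into a target view (simplified sketch).

    image: (H, W, 3) float array; depth: (H, W) positive depths.
    K: (3, 3) pinhole intrinsics with last row [0, 0, 1] (assumed shared).
    R, t: rotation (3, 3) and translation (3,) mapping source-camera
          coordinates to target-camera coordinates.
    Returns the partial target image (H, W, 3) and a visibility mask (H, W).
    """
    H, W = depth.shape
    # Homogeneous pixel grid in row-major order, shape (3, H*W).
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)]).astype(np.float64)

    # Unproject to 3D in the source frame, then move to the target frame.
    pts_src = (np.linalg.inv(K) @ pix) * depth.ravel()
    pts_tgt = R @ pts_src + np.asarray(t, dtype=np.float64).reshape(3, 1)

    # Project into the target image and round to the nearest pixel.
    z = pts_tgt[2]
    safe_z = np.where(z > 1e-6, z, 1.0)  # avoid dividing by zero
    proj = K @ pts_tgt
    x = np.rint(proj[0] / safe_z).astype(int)
    y = np.rint(proj[1] / safe_z).astype(int)
    valid = (z > 1e-6) & (x >= 0) & (x < W) & (y >= 0) & (y < H)

    out = np.zeros((H * W, 3))
    if not valid.any():
        return out.reshape(H, W, 3), np.zeros((H, W), dtype=bool)

    # Softmax-style weights: nearer points (larger inverse depth) dominate
    # when several source pixels land on the same target pixel.
    idx = y[valid] * W + x[valid]
    inv_z = 1.0 / z[valid]
    w = np.exp(alpha * (inv_z - inv_z.max()))

    num = np.zeros((H * W, 3))
    den = np.zeros(H * W)
    np.add.at(num, idx, image.reshape(-1, 3)[valid] * w[:, None])
    np.add.at(den, idx, w)

    # Pixels that received no source point stay empty: these are the holes
    # an inpainter would have to fill.
    mask = den > 0
    out[mask] = num[mask] / den[mask, None]
    return out.reshape(H, W, 3), mask.reshape(H, W)
```

With an identity pose the sketch returns the source image unchanged. Any other pose leaves `mask` false wherever no source pixel lands, which marks the region left for the inpainter.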
Training Inpainter to complete Partial Views (Right). While training on Objaverse, we use an inpainting mask based on epipolar lines, which allows our model to discover object boundaries better.
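To make the geometry behind an epipolar mask concrete, the sketch below rasterizes the epipolar lines that a set of source pixels induce in the target view. It illustrates epipolar lines only; how iNVS actually builds its training mask from them is not specified here. The function `epipolar_mask`, the `width` threshold, and the choice to take the union of full lines are all assumptions for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ a == np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def epipolar_mask(src_pixels, K, R, t, shape, width=1.0):
    """Mark target pixels lying within `width` pixels of an epipolar line.

    src_pixels: iterable of (x, y) source pixel coordinates.
    K: (3, 3) pinhole intrinsics (assumed shared by both views).
    R, t: pose mapping source-camera coordinates to target-camera coordinates.
    shape: (H, W) of the target image.
    """
    H, W = shape
    K_inv = np.linalg.inv(K)
    # Fundamental matrix: x_tgt^T F x_src = 0 for corresponding pixels.
    F = K_inv.T @ skew(np.asarray(t, dtype=float)) @ R @ K_inv

    # Homogeneous coordinates of every target pixel, shape (H, W, 3).
    v, u = np.mgrid[0:H, 0:W]
    grid = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)

    mask = np.zeros((H, W), dtype=bool)
    for px, py in src_pixels:
        line = F @ np.array([px, py, 1.0])  # a*x + b*y + c = 0 in the target
        norm = np.hypot(line[0], line[1])
        if norm < 1e-12:
            continue  # degenerate line (e.g. zero translation)
        dist = np.abs(grid @ line) / norm  # point-to-line distance in pixels
        mask |= dist <= width
    return mask
```

For a purely horizontal camera translation, each source pixel maps to the target row with the same `y` coordinate, so the mask is a set of horizontal bands.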
Qualitative Result: Baseline Comparisons
Compared to baselines, our method preserves sharp details (text and texture) much better.
                    
Qualitative Result: Ablation Study
We ablate our method on various design choices made to demonstrate their importance.
                    
Qualitative Result: Multiple Novel Views
We find that iNVS can generate consistent views across a range of viewpoints, provided that the monocular depth estimator is accurate.
                
Qualitative Result: Failure Modes
We find that iNVS struggles most when the monocular depth estimator produces inaccurate depth.
                
The website template was borrowed from Michaël Gharbi.