Conditional Single-view Shape Generation for Multi-view Stereo Reconstruction

Yi Wei*      Shaohui Liu*      Wang Zhao*      Jiwen Lu      Jie Zhou     

Tsinghua University

In this paper, we present a new perspective on image-based shape generation. Most existing deep-learning-based shape reconstruction methods employ a single-view deterministic model, which is sometimes insufficient to determine a single ground-truth shape because the back of the object is occluded. In this work, we first introduce a conditional generative network to model the uncertainty in single-view reconstruction. Then, we formulate multi-view reconstruction as taking the intersection of the shape spaces predicted from each single image. We design new differentiable guidance, including the front constraint, the diversity constraint, and the consistency loss, to enable effective single-view conditional generation and multi-view synthesis. Experimental results and ablation studies show that our approach outperforms state-of-the-art methods in 3D reconstruction test error and demonstrate its generalization ability on real-world data.
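As a rough illustration of the multi-view idea, the sketch below scores how well a set of per-view shape predictions agree with one another. It is not the implementation from the paper: it assumes shapes are point clouds stored as NumPy arrays of shape (N, 3), uses the symmetric Chamfer distance as the agreement measure, and the function names `chamfer_distance` and `consistency_loss` are hypothetical. Under these assumptions, shapes that lie in the "intersection" of the per-view shape spaces would be those for which this score is small.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged in both directions.
    """
    # Pairwise squared distances via broadcasting, result has shape (N, M).
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    # Nearest neighbour in b for each point of a, and vice versa.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


def consistency_loss(shapes):
    """Mean pairwise Chamfer distance over shapes predicted from different views.

    Hypothetical stand-in for a multi-view consistency term: the value is
    zero when all views predict the same point set and grows as they disagree.
    """
    n = len(shapes)
    if n < 2:
        raise ValueError("need at least two per-view shapes")
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(chamfer_distance(shapes[i], shapes[j]) for i, j in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(128, 3))
    # Three toy "per-view" predictions: the same shape with small perturbations.
    views = [base + 0.01 * rng.normal(size=base.shape) for _ in range(3)]
    print("consistency loss:", consistency_loss(views))
```

In a setting like the one described above, a score of this kind could be minimized over the latent codes of the conditional generator, one code per input view; that optimization loop is omitted here because it depends on details of the network that this page does not give.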

Yi Wei*, Shaohui Liu*, Wang Zhao*, Jiwen Lu, and Jie Zhou
Conditional Single-view Shape Generation for Multi-view Stereo Reconstruction
CVPR 2019   [arXiv] [BibTeX]

The datasets we used are ShapeNetCore and Stanford Online Products. The training data can be downloaded from the link below:

Related Papers
[1] Christopher B. Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, Silvio Savarese. "3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction", in ECCV 2016.
[2] Haoqiang Fan*, Hao Su*, Leonidas Guibas. "A Point Set Generation Network for 3D Object Reconstruction from a Single Image", in CVPR 2017.
[3] Abhishek Kar, Christian Häne, Jitendra Malik. "Learning a Multi-view Stereo Machine", in NIPS 2017.
[4] Chen-Hsuan Lin, Chen Kong, Simon Lucey. "Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction", in AAAI 2018.
[5] Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas J. Guibas. "Learning Representations and Generative Models For 3D Point Clouds", in ICML 2018.
[6] Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison. "CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM", in CVPR 2018.
[7] Shubham Tulsiani, Alexei A. Efros, Jitendra Malik. "Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction", in CVPR 2018.


This work was supported in part by the National Natural Science Foundation of China under Grant U1813218, Grant 61822603, Grant U1713214, Grant 61672306, and Grant 61572271.