Seeing the World in a Bag of Chips

CVPR 2020

Jeong Joon Park   Aleksander Holynski   Steve Seitz 
University of Washington

Abstract -- We address the dual problems of novel view synthesis and environment reconstruction from hand-held RGBD sensors. Our contributions include 1) modeling highly specular objects, 2) modeling inter-reflections and Fresnel effects, and 3) enabling surface light field reconstruction with the same input needed to reconstruct shape alone. In cases where the scene surface has a strong mirror-like material component, we generate highly detailed environment images, revealing room composition, objects, people, buildings, and trees visible through windows. Our approach yields state-of-the-art view synthesis, operates on low dynamic range imagery, and is robust to geometric and calibration errors.


Supplementary Video

Oral Presentation

Citation: Park, Jeong Joon, Aleksander Holynski, and Steven M. Seitz. "Seeing the World in a Bag of Chips." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.

Contact Author: Jeong Joon Park, jjpark7[at]cs[dot]washington[dot]edu

Acknowledgements: This work was supported by funding from the UW Reality Lab.

Dataset: Below are links to the data used in the paper. Each zip file contains the following:
Contains RGB images (640x480). For gamma correction (i.e. to linearize intensity), use a gamma of 1.6: p^1.6, 0 <= p <= 1.
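As a minimal sketch of the gamma correction above (the function name is ours; it assumes pixel values already normalized to [0, 1]):

```python
import numpy as np

def linearize(rgb, gamma=1.6):
    """Convert gamma-encoded pixel values in [0, 1] to linear
    intensity via p -> p**gamma (gamma = 1.6 per the dataset notes)."""
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 0.0, 1.0)
    return rgb ** gamma
```

For 8-bit images, divide by 255.0 first to obtain the normalized values p.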

Contains depth images. The depth maps are stored as 640x480 16-bit monochrome images in PNG format. Depth values are scaled by a factor of 1000, i.e. a value of 1000 corresponds to 1 meter. They are synchronized and registered with the RGB images of the same numbering.
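A sketch of decoding the stored depth values into meters (the function name is ours; loading the 16-bit PNG itself, e.g. with Pillow or OpenCV, is left to the reader):

```python
import numpy as np

DEPTH_SCALE = 1000.0  # a stored value of 1000 corresponds to 1 meter

def depth_to_meters(depth_raw):
    """Convert a raw 16-bit depth image to float32 meters.
    A value of 0 typically indicates missing depth (an assumption,
    common for RGBD sensors, not stated in the dataset notes)."""
    return np.asarray(depth_raw, dtype=np.float32) / DEPTH_SCALE
```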

Contains the 3x4 SE(3) camera-to-world pose matrix of each input image, stored in the order: tx ty tz r00 r01 r02 ...
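A sketch of assembling the 12 stored numbers into a 3x4 [R | t] matrix, assuming the rotation entries r00 r01 ... r22 are row-major (the function name is ours):

```python
import numpy as np

def parse_pose(values):
    """Build a 3x4 camera-to-world pose matrix [R | t] from 12 numbers
    in the stored order: tx ty tz r00 r01 r02 ... r22 (row-major R)."""
    v = np.asarray(values, dtype=np.float64)
    assert v.shape == (12,), "expected 12 pose values"
    t = v[:3]                 # translation tx, ty, tz
    R = v[3:].reshape(3, 3)   # 3x3 rotation, row-major
    return np.hstack([R, t[:, None]])
```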

Mesh of the 3D reconstruction of the scene. The floor is aligned to the XY plane.

Diffuse texture map used for rendering xy_mesh.ply

All datasets were captured with the same camera, with focal length 541.961 and principal point (320, 240).
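Given these intrinsics, a pixel with known metric depth can be lifted into the camera frame with the standard pinhole model. A sketch (the function name is ours; it assumes the same focal length for both axes and no lens distortion):

```python
import numpy as np

# Shared intrinsics for all sequences (pinhole model assumed).
FX = FY = 541.961
CX, CY = 320.0, 240.0

def backproject(u, v, depth_m):
    """Lift pixel (u, v) with depth in meters to a 3D point
    (x, y, z) in the camera coordinate frame."""
    x = (u - CX) / FX * depth_m
    y = (v - CY) / FY * depth_m
    return np.array([x, y, depth_m])
```

Combined with the camera-to-world poses above, this yields points in world coordinates via p_world = R @ p_cam + t.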