Multi-View Stereo for Community Photo Collections
Michael Goesele, Noah Snavely, Brian Curless,
Hugues Hoppe, Steven M. Seitz
Abstract
We present a multi-view stereo algorithm that addresses the extreme
changes in lighting, scale, clutter, and other effects in large
online community photo collections. Our idea is to intelligently
choose images to match, both at a per-view and per-pixel level. We
show that such adaptive view selection enables robust performance
even with dramatic appearance variability. The stereo matching
technique takes as input sparse 3D points reconstructed from
structure-from-motion methods and iteratively grows surfaces from
these points. Optimizing for surface normals within a
photoconsistency measure significantly improves the matching
results. While the focus of our approach is to estimate
high-quality depth maps, we also show examples of merging the
resulting depth maps into compelling scene reconstructions. We
demonstrate our algorithm on standard multi-view stereo datasets and
on casually acquired photo collections of famous scenes gathered
from the Internet.
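As a rough illustration of the ideas in the abstract, the following minimal sketch shows a normalized cross-correlation (NCC) score, a common photoconsistency measure, and a confidence-ordered region growing loop that starts from sparse seed depths. It is not the paper's implementation: the function names, the `estimate` callback, the 4-neighborhood, and the `min_conf` threshold are all assumptions made for illustration, and the per-pixel optimization of depth and surface normal across selected views is left to the hypothetical callback.

```python
import heapq
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch carries no texture to match
    return sum(x * y for x, y in zip(da, db)) / denom


def grow_depth_map(width, height, seeds, estimate, min_conf=0.6):
    """Grow a depth map outward from sparse seeds, most confident first.

    seeds:    dict {(x, y): depth}, standing in for projected SfM points.
    estimate: hypothetical callback (x, y, init_depth) -> (depth, conf);
              a real system would optimize depth (and normal) here.
    min_conf: assumed acceptance threshold on the returned confidence.
    """
    depth = {}
    heap = []
    for (x, y), d in seeds.items():
        # Seeds get the highest priority (heapq is a min-heap, so negate).
        heapq.heappush(heap, (-1.0, x, y, d))
    while heap:
        _, x, y, init = heapq.heappop(heap)
        if (x, y) in depth:
            continue  # already reconstructed from a better candidate
        d, conf = estimate(x, y, init)
        if conf < min_conf:
            continue  # reject; another neighbor may retry this pixel later
        depth[(x, y)] = d
        # Queue the 4-neighbors, initialized with this pixel's depth.
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in depth:
                heapq.heappush(heap, (-conf, nx, ny, d))
    return depth
```

In this sketch, an `estimate` callback would typically compare a patch around `(x, y)` in the reference view against the corresponding patches in the selected neighboring views, for example by averaging `ncc` scores, and return the best depth together with that score as its confidence.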
Publication
Multi-View Stereo for Community Photo Collections
Michael Goesele, Noah Snavely, Brian Curless, Hugues Hoppe, Steven M. Seitz
Proceedings of ICCV 2007, Rio de Janeiro, Brazil, October 14-20, 2007.
Overview Talk
The Google Tech Talk "Navigating the World's Photographs" by Steve Seitz, Noah Snavely, and Michael Goesele gives a good overview of our current work on community photo collections, including multi-view stereo reconstruction. View a video of the talk on Google Video, or download the video in Flash Video (FLV) format (114 MB) or AVI format (141 MB).
In the Press ...
NewScientist.com ran a story on this work on October 29th, 2007. Have a look at the article "Holiday snapshots used to model the world in 3D" by Will Knight, which explains the basic ideas behind the paper.
Datasets
The following datasets were reconstructed using the proposed multi-view stereo technique. Models were trimmed using standard mesh processing operations to remove spurious geometry introduced by the Poisson reconstruction approach.
Venus de Milo, Paris, France
reconstruction based on 129 images from Flickr
[figures: rendered model, example input image, large image of the rendered model]

Duomo in Pisa, Italy
reconstruction based on 56 images from Flickr captured by 8 photographers
[figures: rendered model, example input image, large image of the rendered model]

Notre Dame de Paris, France
reconstruction based on 653 images from Flickr captured by 313 photographers
[figures: rendered model, example input image, large image of the rendered model]
The result movie below shows a reconstruction of the central portal based on the same dataset.

Temple and dino models from the multi-view stereo evaluation page
templeFull reconstruction based on 312 images from the test set (0.42 mm accuracy, 98.2% completeness)
dinoFull reconstruction based on 363 images from the test set (0.46 mm accuracy, 96.7% completeness)
[figures: rendered models, example input images]
More information is available at the multi-view stereo evaluation page.
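To make the accuracy and completeness figures quoted for the temple and dino models easier to interpret, here is a minimal sketch of how such metrics can be computed on point samples. It assumes the definitions commonly associated with the multi-view stereo evaluation: accuracy is the distance within which a given fraction of reconstructed points lie from the ground truth, and completeness is the percentage of ground-truth points within a given distance of the reconstruction. The default thresholds (90% and 1.25 mm), the function names, and the brute-force nearest-neighbor search are assumptions for illustration; the actual evaluation operates on meshes and handles details such as holes that are ignored here.

```python
import math


def nearest_distances(src, dst):
    """Distance from each point in src to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def accuracy(recon, truth, fraction=0.9):
    """Smallest distance d such that `fraction` of the reconstructed
    points lie within d of the ground truth (assumed default: 90%)."""
    d = sorted(nearest_distances(recon, truth))
    k = max(0, math.ceil(fraction * len(d)) - 1)
    return d[k]


def completeness(recon, truth, max_dist=1.25):
    """Percentage of ground-truth points within `max_dist` of the
    reconstruction (assumed default: 1.25, in the model's units)."""
    d = nearest_distances(truth, recon)
    return 100.0 * sum(1 for x in d if x <= max_dist) / len(d)
```

Lower accuracy values and higher completeness values are better. Under these assumed definitions, a reconstruction that misses part of the object can still score well on accuracy while losing completeness.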
Result Movie
The following movie shows a reconstruction of the central portal of Notre Dame cathedral in Paris. The reconstruction is based on 653 images from Flickr.
Acknowledgements
We would like to thank all photographers who made their images available via Flickr.
More about Community Photo Collections ...
... can be found at the Community Photo Collections project page.