3D Wikipedia: Using Online Text to Automatically Label and Navigate Reconstructed Geometry

Bryan C. Russell, Ricardo Martin-Brualla, Daniel J. Butler,
Steven M. Seitz, Luke Zettlemoyer

Overview

We introduce an approach for analyzing Wikipedia and other text, together with online photos, to produce annotated 3D models of famous tourist sites. The approach is fully automated and leverages online text and photo co-occurrences via Google Image Search. It enables a number of new interactions, which we demonstrate in a new 3D visualization tool: text can be selected to move the camera to the corresponding objects, 3D bounding boxes provide anchors back to the text describing them, and the overall narrative of the text provides a temporal guide for automatically flying through the scene, so that you can visualize the world as you read about it. We show compelling results on several major tourist sites.
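As a rough illustration of the text-to-camera interaction described above, the following Python sketch computes an axis-aligned 3D bounding box from the points assigned to a labeled object and returns the box center as a point for the camera to look at. It is not taken from the released source code; the names `Box3D`, `bounding_box`, and `camera_target`, the use of axis-aligned boxes, and the choice of the box center as the look-at point are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class Box3D:
    """Hypothetical axis-aligned 3D bounding box (min and max corners)."""
    lo: Point
    hi: Point

    @property
    def center(self) -> Point:
        # Midpoint of the two corners along each axis.
        return tuple((a + b) / 2.0 for a, b in zip(self.lo, self.hi))


def bounding_box(points: Sequence[Point]) -> Box3D:
    """Return the tightest axis-aligned box enclosing the given 3D points."""
    if not points:
        raise ValueError("need at least one point")
    xs, ys, zs = zip(*points)
    return Box3D((min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs)))


def camera_target(label: str, boxes: Dict[str, Box3D]) -> Point:
    """Look-at point for a selected text label (assumed: the box center)."""
    return boxes[label].center
```

For example, building `boxes = {"oculus": bounding_box(oculus_points)}` from a hypothetical list of points for one object and calling `camera_target("oculus", boxes)` would give the position the viewer could fly toward when that phrase is selected in the text.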

Paper & Presentation

Bryan C. Russell, Ricardo Martin-Brualla, Daniel J. Butler, Steven M. Seitz, and Luke Zettlemoyer.
3D Wikipedia: Using Online Text to Automatically Label and Navigate Reconstructed Geometry.
ACM Transactions on Graphics (SIGGRAPH Asia 2013), Vol. 32, No. 6.
(PDF | BibTeX)

SIGGRAPH Asia talk slides (PPTX, 227MB)

Code and Data

The source code is available on our GitHub project page.

We also provide the input text, reference image, and a MATLAB data structure containing the sparse 3D point cloud for the Pantheon in Rome (73 MB). The demo script in the source code uses this data.

Funding Acknowledgements

The research was supported in part by the National Science Foundation (IIS-1250793), the Intel Science and Technology Centers for Visual Computing (ISTC-VC) and Pervasive Computing (ISTC-PC), and Google.