Optimal Gait and Form for Animal Locomotion
- Abstract:
- We present a fully automatic method for generating gaits and morphologies for legged animal locomotion. Given a specific animal's shape we can determine an efficient gait with which it can move. Similarly, we can also adapt the animal's morphology to be optimal for a specific locomotion task. We show that determining such gaits is possible without the need to specify a good initial motion, and without manually restricting the allowed gaits of each animal. Our approach is based on a hybrid optimization method which combines an efficient derivative-aware spacetime constraints optimization with a derivative-free approach able to find non-local solutions in high-dimensional discontinuous spaces. We demonstrate the effectiveness of this approach by synthesizing dynamic locomotions of bipeds, a quadruped, and an imaginary five-legged creature.
- Citation:
- Wampler, K. and Popović, Z. Optimal Gait and Form for Animal Locomotion. ACM Transactions on Graphics 28(3), August 2009 (Proceedings of SIGGRAPH 2009).
- On-line documents:
- Complete article (PDF)
Project
Contact-aware Nonlinear Control of Dynamic Characters
- Abstract:
- Dynamically simulated characters are difficult to control because they are underactuated - they have no direct control over their global position and orientation. In order to succeed, control policies must look ahead to determine stabilizing actions, but such planning is complicated by frequent ground contacts that produce a discontinuous search space. This paper introduces a locomotion system that generates high-quality animation of agile movements using nonlinear controllers that plan through such contact changes. We demonstrate the general applicability of this approach by emulating walking and running motions in rigid-body simulations. Then we consolidate these controllers under a higher-level planner that interactively controls the character's direction.
- Citation:
- Muico, U., Lee, Y., Popović, J. and Popović, Z. Contact-aware Nonlinear Control of Dynamic Characters. ACM Transactions on Graphics 28(3), August 2009 (Proceedings of SIGGRAPH 2009).
- On-line documents:
- Complete article (PDF)
Project
Dense 3D Motion Capture for Human Faces
- Abstract:
- This paper proposes a novel approach to motion capture from multiple, synchronized video streams, specifically aimed at recording dense and accurate models of the structure and motion of highly deformable surfaces such as skin, that stretches, shrinks, and shears in the midst of normal facial expressions. Solving this problem is a key step toward effective performance capture for the entertainment industry, but progress so far has been hampered by the lack of appropriate local motion and smoothness models. The main technical contribution of this paper is a novel approach to regularization adapted to nonrigid tangential deformations. Concretely, we estimate the nonrigid deformation parameters at each vertex of a surface mesh, smooth them over a local neighborhood for robustness, and use them to regularize the tangential motion estimation. To demonstrate the power of the proposed approach, we have integrated it into our previous work for markerless motion capture [9], and compared the performances of the original and new algorithms on three extremely challenging face datasets that include highly nonrigid skin deformations, wrinkles, and quickly changing expressions. Additional experiments with a dataset featuring fast-moving cloth with complex and evolving fold structures demonstrate that the adaptability of the proposed regularization scheme to nonrigid tangential motion does not hamper its robustness, since it successfully recovers the shape and motion of the cloth without overfitting it despite the absence of stretch or shear in this case.
- Citation:
- Furukawa, Y. and Ponce, J. Dense 3D Motion Capture for Human Faces. To appear, CVPR 2009, June 2009.
- On-line documents:
- Complete article (PDF)
Project
Parallax Photography: Creating 3D Cinematic Effects from Stills
- Abstract:
- We present an approach to convert a small portion of a light field with extracted depth information into a cinematic effect with simulated, smooth camera motion that exhibits a sense of 3D parallax. We develop a taxonomy of the cinematic conventions of these effects, distilled from observations of documentary film footage and organized by the number of subjects of interest in the scene. We present an automatic, content-aware approach to apply these cinematic conventions to an input light field. A face detector identifies subjects of interest. We then optimize for a camera path that conforms to a cinematic convention, maximizes apparent parallax, and avoids missing information in the input. We describe a GPU-accelerated, temporally coherent rendering algorithm that allows users to create more complex camera moves interactively, while experimenting with effects such as focal length, depth of field, and selective, depth-based desaturation or brightening. We evaluate and demonstrate our approach on a wide variety of scenes and present a user study that compares our 3D cinematic effects to their 2D counterparts.
- Citation:
- Zheng, K., Colburn, A., Agarwala, A., Agrawala, M., Curless, B., Salesin, D., and Cohen, M. Parallax Photography: Creating 3D Cinematic Effects from Stills. Proceedings of Graphics Interface 2009.
- On-line documents:
- Complete article (PDF)
Project
Dictionary-Free Categorization of Very Similar Objects via Stacked Evidence Trees
- Abstract:
- Current work in object categorization discriminates among objects that typically possess gross differences which are readily apparent. However, many applications require making much finer distinctions. We address an insect categorization problem that is so challenging that even trained human experts cannot readily categorize the insects based on their images. The state of the art that uses visual dictionaries, when applied to this problem, yields mediocre results (16.1% error). Three possible explanations for this are (a) the dictionaries are unsupervised, (b) the dictionaries lose the detailed information contained in each keypoint, and (c) these methods rely on hand-engineered decisions about dictionary size. This paper presents a novel, dictionary-free methodology. A random forest of trees is first trained to predict the class of an image based on individual keypoint descriptors. A unique aspect of these trees is that they do not make decisions but instead merely record evidence - i.e., the number of descriptors from training examples of each category that reached each leaf of the tree. We provide a mathematical model showing that voting evidence is better than voting decisions. To categorize a new image, descriptors for all detected keypoints are "dropped" through the trees, and the evidence at each leaf is summed to obtain an overall evidence vector. This is then sent to a second-level classifier to make the categorization decision. We achieve excellent performance (6.4% error) on the 9-class STONEFLY9 data set. Also, our method achieves an average AUC of 0.921 on the PASCAL06 VOC, which places it fifth out of 21 methods reported in the literature and demonstrates that the method also works well for generic object categorization.
- Citation:
- Martínez-Muñoz, G., Zhang, W., Payet, N., Todorovic, S., Larios, N., Yamamuro, A., Lytle, D., Moldenke, A., Mortensen, E., Paasch, R., Shapiro, L., and Dietterich, T. Dictionary-Free Categorization of Very Similar Objects via Stacked Evidence Trees. To appear, CVPR 2009, June 2009.
- On-line documents:
- Complete article (PDF)
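The evidence-summing step described in the abstract above can be illustrated with a minimal sketch. The nested-dict tree layout and the names `leaf_evidence` and `image_evidence` are assumptions made for illustration, not the authors' implementation; the second-level classifier that consumes the evidence vector is not shown.

```python
def leaf_evidence(tree, descriptor):
    """Drop one descriptor through a tree and return the per-class counts
    stored at the leaf it reaches (hypothetical nested-dict tree layout)."""
    node = tree
    while "counts" not in node:
        go_left = descriptor[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["counts"]


def image_evidence(forest, descriptors, num_classes):
    """Sum leaf evidence over every descriptor and every tree.

    No tree casts a vote; each one only contributes the training counts
    recorded at the leaf that the descriptor lands in.
    """
    total = [0] * num_classes
    for descriptor in descriptors:
        for tree in forest:
            for cls, count in enumerate(leaf_evidence(tree, descriptor)):
                total[cls] += count
    return total


# Toy forest of two one-split trees over 1-D descriptors, two classes.
stump1 = {"feature": 0, "threshold": 0.5,
          "left": {"counts": [3, 1]}, "right": {"counts": [0, 4]}}
stump2 = {"feature": 0, "threshold": 0.1,
          "left": {"counts": [1, 0]}, "right": {"counts": [2, 2]}}
evidence = image_evidence([stump1, stump2], [(0.2,), (0.9,)], num_classes=2)
```

In this toy setup the resulting vector would be handed to a separate classifier rather than reduced with an argmax, which is the distinction the abstract draws between voting evidence and voting decisions.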
Manhattan-World Stereo
- Abstract:
- Multi-view stereo (MVS) algorithms now produce reconstructions that rival laser range scanner accuracy. However, stereo algorithms require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach to overcome these limitations for Manhattan World scenes, i.e., scenes that consist of piece-wise planar surfaces with dominant directions. Given a set of calibrated photographs, we first reconstruct textured regions using an existing MVS algorithm, then extract dominant plane directions, generate plane hypotheses, and recover per-view depth maps using Markov random fields. We have tested our algorithm on several datasets ranging from office interiors to outdoor buildings, and demonstrate results that outperform the current state of the art for such texture-poor scenes.
- Citation:
- Furukawa, Y., Curless, B., Seitz, Steven M., and Szeliski, R. Manhattan-World Stereo. To appear, CVPR 2009, June 2009.
- On-line documents:
- Complete article (PDF)
Project
Enhancing and Experiencing Spacetime Resolution with Videos and Stills
- Abstract:
- We present solutions for enhancing the spatial and/or temporal resolution of videos. Our algorithm targets the emerging consumer-level hybrid cameras that can simultaneously capture video and high-resolution stills. Our technique produces a high spacetime resolution video using the high-resolution stills for rendering and the low-resolution video to guide the reconstruction and the rendering process. Our framework integrates and extends two existing algorithms, namely a high-quality optical flow algorithm and a high-quality image-based-rendering algorithm. The framework enables a variety of applications that were previously unavailable to the amateur user, such as the ability to (1) automatically create videos with high spatiotemporal resolution, and (2) shift a high-resolution still to nearby points in time to better capture a missed event.
- Citation:
- Gupta, A., Bhat, P., Dontcheva, M., Cohen, Michael F., Curless, B., and Deussen, O. Enhancing and Experiencing Spacetime Resolution with Videos and Stills. To appear, ICCP 2009, April 2009.
- On-line documents:
- Complete article (PDF)
Project Page
Zoetrope: Interacting with the Ephemeral Web
- Abstract:
- The Web is ephemeral. Pages change frequently, and it is nearly impossible to find data or follow a link after the underlying page evolves. We present Zoetrope, a system that enables interaction with the historical Web (pages, links, and embedded data) that would otherwise be lost to time. Using a number of novel interactions, the temporal Web can be manipulated, queried, and analyzed from the context of familiar pages. Zoetrope is based on a set of operators for manipulating content streams. We describe these primitives and the associated indexing strategies for handling temporal Web data. They form the basis of Zoetrope and enable our construction of new temporal interactions and visualizations.
- Citation:
- Adar, E., Dontcheva, M., Fogarty, J., and Weld, D. S. Zoetrope: Interacting with the Ephemeral Web. UIST 2008.
- On-line documents:
- Complete article (PDF, 2MB)
Adaptive Layout for Dynamically Aggregated Documents
- Abstract:
- We present a system for designing and displaying grid-based document designs that adapt to many different viewing conditions and content selections. Our system can display traditional, static documents, or it can assemble dynamic documents "on the fly" from many disparate sources via the Internet. Our adaptive layouts for aggregated documents are inspired by traditional newspaper design. Furthermore, our system allows documents to be interactive so that readers can customize documents as they read them. Our system builds on previous work on adaptive documents, using constraint based templates to specify content-independent page designs. The new templates we describe are much more flexible in their ability to adapt to different types of content and viewing situations. This flexibility comes from allowing the individual components, or "elements," of the templates to be mixed and matched, according to the content being displayed. We demonstrate our system with two example applications: an interactive news reader for the New York Times, and an Internet news aggregator based on MSN Newsbot.
- Citation:
- Schrier, E., Dontcheva, M., Jacobs, C., Wade, G., and Salesin, D. Adaptive Layout for Dynamically Aggregated Documents. IUI 2008.
- On-line documents:
- Complete article (PDF, 4MB)
Creating Map-based Storyboards for Browsing Tour Videos
- Abstract:
- Watching a long unedited video is usually a boring experience. In this paper we examine a particular subset of videos, tour videos, in which the video is captured by walking about with a running camera with the goal of conveying the essence of some place. We present a system that makes the process of sharing and watching a long tour video easier, less boring, and more informative. To achieve this, we augment the tour video with a map-based storyboard, where the tour path is reconstructed, and coherent shots at different locations are directly visualized on the map. This allows the viewer to navigate the video in the joint location-time space. To create such a storyboard we employ an automatic pre-processing component to parse the video into coherent shots, and an authoring tool to enable the user to tie the shots with landmarks on the map. The browser-based viewing tool allows users to navigate the video in a variety of creative modes with a rich set of controls, giving each viewer a unique, personal viewing experience. Informal evaluation shows that our approach works well for tour videos compared with conventional media players.
- Citation:
- Pongnumkul, S., Wang, J., and Cohen, M. Creating Map-based Storyboards for Browsing Tour Videos. UIST 2008.
- On-line documents:
- Complete article (PDF, 2MB)
A Salient-Point Signature for 3D Object Retrieval
- Abstract:
- In this paper we describe a new 3D object signature and evaluate its performance for 3D object retrieval. The signature is based on a learning approach that finds the characteristics of salient points on a 3D object and represents the points in a 2D spatial map based on a longitude-latitude transformation. Experimental results show that the signature is able to achieve good retrieval scores for both pose-normalized and randomly-rotated object queries.
- Citation:
- Atmosukarto, I. and Shapiro, L. G. A Salient-Point Signature for 3D Object Retrieval. ACM Multimedia Information Retrieval (MIR), October 2008.
- On-line documents:
- Complete article (PDF, 2MB)
Fourier Analysis of the 2D Screened Poisson Equation for Gradient Domain Problems
- Abstract:
- We analyze the problem of reconstructing a 2D function that approximates a set of desired gradients and a data term. The combined data and gradient terms enable operations like modifying the gradients of an image while staying close to the original image. Starting with a variational formulation, we arrive at the "screened Poisson equation" known in physics. Analysis of this equation in the Fourier domain leads to a direct, exact, and efficient solution to the problem. Further analysis reveals the structure of the spatial filters that solve the 2D screened Poisson equation and shows gradient scaling to be a well-defined sharpen filter that generalizes Laplacian sharpening, which itself can be mapped to gradient domain filtering. Results using a DCT-based screened Poisson solver are demonstrated on several applications including image blending for panoramas, image sharpening, and de-blocking of compressed images.
- Citation:
- Bhat, P., Curless, B., Cohen, M., and Zitnick, C. L. Fourier Analysis of the 2D Screened Poisson Equation for Gradient Domain Problems. ECCV 2008.
- On-line documents:
- Complete article (PDF, 1MB)
Project Site
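The path from the variational formulation to the Fourier-domain solution mentioned in the abstract above can be sketched as follows. The symbols are assumptions chosen for illustration ($f$ the unknown function, $u$ the data term, $\mathbf{g} = (g^x, g^y)$ the desired gradients, $\lambda_d$ the data weight, capitals for Fourier transforms), and boundary handling, which the DCT-based solver in the abstract addresses, is ignored here.

```latex
% Energy: stay close to the data u while matching the desired gradients g.
E(f) = \iint \lambda_d \,(f - u)^2 + \lVert \nabla f - \mathbf{g} \rVert^2 \; dx\, dy

% Euler-Lagrange condition: the screened Poisson equation.
\lambda_d f - \nabla^2 f = \lambda_d u - \nabla \cdot \mathbf{g}

% In the Fourier domain, \partial_x \mapsto i\omega_x and \partial_y \mapsto i\omega_y,
% so the equation becomes algebraic and can be solved per frequency:
F(\omega_x, \omega_y) =
  \frac{\lambda_d\, U - i\,(\omega_x G^x + \omega_y G^y)}
       {\lambda_d + \omega_x^2 + \omega_y^2}
```

With $\lambda_d > 0$ the denominator never vanishes, which is one way to see why the solution is direct and exact; with $\lambda_d = 0$ the expression reduces to the ordinary Poisson case, where the constant (zero-frequency) term is undetermined.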
Scene Segmentation Using the Wisdom of Crowds
- Abstract:
- Given a collection of images of a static scene taken by many different people, we identify and segment interesting objects. To solve this problem, we use the distribution of images in the collection along with a new field-of-view cue, which leverages the observation that people tend to take photos that frame an object of interest within the field of view. Hence, image features that appear together in many images are likely to be part of the same object. We evaluate the effectiveness of this cue by comparing the segmentations computed by our method against hand-labeled ones for several different models. We also show how the results of our segmentations can be used to highlight important objects in the scene and label them using noisy user-specified textual tag data. These methods are demonstrated on photos of several popular tourist sites downloaded from the Internet.
- Citation:
- Simon, I. and Seitz, S. M. Scene Segmentation Using the Wisdom of Crowds. ECCV 2008.
- On-line documents:
- Complete article (PDF, 16MB)
Fast Algorithms for L_infty Problems in Multiview Geometry
- Abstract:
- Many problems in multi-view geometry, when posed as minimization of the maximum reprojection error across observations, can be solved optimally in polynomial time. We show that these problems are instances of a convex-concave generalized fractional program. We survey the major solution methods for solving problems of this form and present them in a unified framework centered around a single parametric optimization problem. We propose two new algorithms and show that the algorithm proposed by Olsson et al. [21] is a special case of a classical algorithm for generalized fractional programming. The performance of all the algorithms is compared on a variety of datasets, and the algorithm proposed by Gugat [12] stands out as a clear winner. An open source MATLAB toolbox that implements all the algorithms presented here is made available.
- Citation:
- Agarwal, S., Snavely, N., and Seitz, S. M. Fast Algorithms for L_infty Problems in Multiview Geometry. CVPR 2008.
- On-line documents:
- Complete article (PDF, 1MB)
Video Object Annotation, Navigation, and Composition
- Abstract:
- We explore the use of tracked 2D object motion to enable novel approaches to interacting with video. These include moving annotations, video navigation by direct manipulation of objects, and creating an image composite from multiple video frames. Features in the video are automatically tracked and grouped in an off-line preprocess that enables later interactive manipulation. Examples of annotations include speech and thought balloons, video graffiti, path arrows, video hyperlinks, and schematic storyboards. We also demonstrate a direct-manipulation interface for random frame access using spatial constraints, and a drag-and-drop interface for assembling still images from videos. Taken together, our tools can be employed in a variety of applications including film and video editing, visual tagging, and authoring rich media such as hyperlinked video.
- Citation:
- Goldman, D. B., Gonterman, C., Curless, B., Salesin, D., and Seitz, S. M. Video Object Annotation, Navigation, and Composition. UIST 2008.
- On-line documents:
- Complete article (PDF, 7MB)
Project Site
In Defense of Nearest-Neighbor Based Image Classification
- Abstract:
- State-of-the-art image classification methods require an intensive learning/training stage (using SVM, Boosting, etc.). In contrast, non-parametric Nearest-Neighbor (NN) based image classifiers require no training time and have other favorable properties. However, the large performance gap between these two families of approaches rendered NN-based image classifiers useless.
We claim that the effectiveness of non-parametric NN-based image classification has been considerably undervalued. We argue that two practices commonly used in image classification methods have led to the inferior performance of NN-based image classifiers: (i) Quantization of local image descriptors (used to generate "bags-of-words," codebooks). (ii) Computation of 'Image-to-Image' distance, instead of 'Image-to-Class' distance.
We propose a trivial NN-based classifier, NBNN (Naive-Bayes Nearest-Neighbor), which employs NN-distances in the space of the local image descriptors (and not in the space of images). NBNN computes direct 'Image-to-Class' distances without descriptor quantization. We further show that under the Naive-Bayes assumption, the theoretically optimal image classifier can be accurately approximated by NBNN.
Although NBNN is extremely simple, efficient, and requires no learning/training phase, its performance ranks among the top leading learning-based image classifiers. Empirical comparisons are shown on several challenging databases (Caltech-101, Caltech-256 and Graz-01).
- Citation:
- Boiman, O., Shechtman, E., and Irani, M. In Defense of Nearest-Neighbor Based Image Classification. CVPR 2008.
- On-line documents:
- Complete article (PDF, 1MB)
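The 'Image-to-Class' decision rule described in the NBNN abstract above can be sketched in a few lines. This is an illustrative reading of the abstract, not the authors' code: the function names, the use of squared Euclidean distance, and the brute-force nearest-neighbor search are all assumptions, and descriptors are plain tuples of numbers.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def image_to_class_distance(query_descriptors, class_pool):
    """Sum, over the query's descriptors, of the distance to the nearest
    descriptor in the pool of ALL training descriptors of one class."""
    return sum(
        min(squared_distance(d, p) for p in class_pool)
        for d in query_descriptors
    )


def nbnn_classify(query_descriptors, pools):
    """Return the label with the smallest Image-to-Class distance.

    pools maps each label to the unquantized descriptors pooled from all
    training images of that class; no training step is involved.
    """
    return min(
        pools,
        key=lambda label: image_to_class_distance(query_descriptors, pools[label]),
    )


# Toy example with 2-D descriptors and two classes.
pools = {"a": [(0, 0), (1, 0)], "b": [(10, 10), (9, 10)]}
label = nbnn_classify([(0.5, 0.1), (0.9, 0.2)], pools)
```

The point of the sketch is that descriptors are compared against a whole class rather than against individual training images, and that they are never quantized into a codebook.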
Summarizing Visual Data Using Bidirectional Similarity
- Abstract:
- We propose a principled approach to summarization of visual data (images or video) based on optimization of a well-defined similarity measure. The problem we consider is re-targeting (or summarization) of image/video data into smaller sizes. A good "visual summary" should satisfy two properties: (1) it should contain as much as possible visual information from the input data; (2) it should introduce as few as possible new visual artifacts that were not in the input data (i.e., preserve visual coherence). We propose a bi-directional similarity measure which quantitatively captures these two requirements: Two signals S and T are considered visually similar if all patches of S (at multiple scales) are contained in T, and vice versa.
The problem of summarization/re-targeting is posed as an optimization problem of this bi-directional similarity measure. We show summarization results for image and video data. We further show that the same approach can be used to address a variety of other problems, including automatic cropping, completion and synthesis of visual data, image collage, object removal, photo reshuffling and more.
- Citation:
- Simakov, D., Caspi, Y., Shechtman, E., and Irani, M. Summarizing Visual Data Using Bidirectional Similarity. CVPR 2008.
- On-line documents:
- Complete article (PDF, 2.5MB)
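The two-sided patch criterion in the abstract above can be illustrated on 1-D signals. This is a simplified sketch under stated assumptions: a single patch size instead of multiple scales, sum-of-squared-differences as the patch distance, and equal weighting of the two directions; all function names are hypothetical.

```python
def patches(signal, size):
    """All contiguous patches of the given size from a 1-D signal."""
    return [tuple(signal[i:i + size]) for i in range(len(signal) - size + 1)]


def ssd(p, q):
    """Sum of squared differences between two equal-length patches."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def directed(src, dst):
    """Average distance from each source patch to its best match in dst."""
    return sum(min(ssd(p, q) for q in dst) for p in src) / len(src)


def bidirectional_dissimilarity(s, t, size=2):
    """Low only when every patch of S appears in T (completeness)
    and every patch of T appears in S (coherence)."""
    ps, pt = patches(s, size), patches(t, size)
    completeness = directed(ps, pt)  # is all of S represented in T?
    coherence = directed(pt, ps)     # does T introduce anything not in S?
    return completeness + coherence


source = [1, 2, 3, 4]
summary = [1, 2, 3]
score = bidirectional_dissimilarity(source, summary)
```

Here the summary is perfectly coherent (all of its patches occur in the source) but incomplete, since the source patch `(3, 4)` has no exact match; only the completeness term contributes to the score.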
MySong: Automatic Accompaniment Generation for Vocal Melodies
- Citation:
- Simon, I., Morris, D., and Basu, S. MySong: Automatic Accompaniment Generation for Vocal Melodies. CHI 2008.
- On-line documents:
- Complete article (PDF, 1MB)
Finding Paths through the World's Photos
- Abstract:
- When a scene is photographed many times by different people, the viewpoints often cluster along certain paths. These paths are largely specific to the scene being photographed, and follow interesting regions and viewpoints. We seek to discover a range of such paths and turn them into controls for image-based rendering. Our approach takes as input a large set of community or personal photos, reconstructs camera viewpoints, and automatically computes orbits, panoramas, canonical views, and optimal paths between views. The scene can then be interactively browsed in 3D using these controls or with six degree-of-freedom free-viewpoint control. As the user browses the scene, nearby views are continuously selected and transformed, using control-adaptive reprojection techniques.
- Citation:
- Snavely, N., Garg, R., Seitz, S. M., and Szeliski, R. Finding Paths through the World's Photos. ACM Transactions on Graphics 27(3), August 2008.
- On-line documents:
- Complete article (PDF, 12MB)
Modeling the World from Internet Photo Collections
- Abstract:
- There are billions of photographs on the Internet, comprising the largest and most diverse photo collection ever assembled. How can computer vision researchers exploit this imagery? This paper explores this question from the standpoint of 3D scene modeling and visualization. We present structure-from-motion and image-based rendering algorithms that operate on hundreds of images downloaded as a result of keyword-based image search queries like "Notre Dame" or "Trevi Fountain." This approach, which we call Photo Tourism, has enabled reconstructions of numerous well-known world sites. This paper presents these algorithms and results as a first step towards 3D modeling of the world's well-photographed sites, cities, and landscapes from Internet imagery, and discusses key open problems and challenges for the research community.
- Citation:
- Snavely, N., Seitz, S. M., and Szeliski, R. Modeling the World from Internet Photo Collections. Accepted to IJCV, 2008.
- On-line documents:
- Complete article (PDF, 2MB)
Skeletal Graphs for Efficient Structure from Motion
- Abstract:
- We address the problem of efficient structure from motion for large, unordered, highly redundant, and irregularly sampled photo collections, such as those found on Internet photo-sharing sites. Our approach computes a small skeletal subset of images, reconstructs the skeletal set, and adds the remaining images using pose estimation. Our technique drastically reduces the number of parameters that are considered, resulting in dramatic speedups, while provably approximating the covariance of the full set of parameters. To compute a skeletal image set, we first estimate the accuracy of two-frame reconstructions between pairs of overlapping images, then use a graph algorithm to select a subset of images that, when reconstructed, approximates the accuracy of the full set. A final bundle adjustment can then optionally be used to restore any loss of accuracy.
- Citation:
- Snavely, N., Seitz, S. M., and Szeliski, R. Skeletal Graphs for Efficient Structure from Motion. CVPR 2008.
- On-line documents:
- Complete article (PDF, 2MB)
Automated Generation of Interactive 3D Exploded View Diagrams
- Abstract:
- We present a system for creating and viewing interactive exploded views of complex 3D models. In our approach, a 3D input model is organized into an explosion graph that encodes how parts explode with respect to each other. We present an automatic method for computing explosion graphs that takes into account part hierarchies in the input models and handles common classes of interlocking parts. Our system also includes an interface that allows users to interactively explore our exploded views using both direct controls and higher-level interaction modes.
- Citation:
- Li, W., Agrawala, M., Curless, B., and Salesin, D. Automated Generation of Interactive 3D Exploded View Diagrams. ACM Transactions on Graphics 27(3), August 2008.
- On-line documents:
- Complete article (PDF, 4.5MB)
Project page
Practical Global Optimization for Multiview Geometry
- Abstract:
- This paper presents a practical method for finding the provably globally optimal solution to numerous problems in projective geometry including multiview triangulation, camera resectioning and homography estimation. Unlike traditional methods which may get trapped in local minima due to the non-convex nature of these problems, this approach provides a theoretical guarantee of global optimality. The formulation relies on recent developments in fractional programming and the theory of convex underestimators and allows a unified framework for minimizing the standard L2-norm of reprojection errors which is optimal under Gaussian noise as well as the more robust L1-norm which is less sensitive to outliers. Even though the worst case complexity of our algorithm is exponential, the practical efficacy is empirically demonstrated by good performance on experiments for both synthetic and real data. An open source MATLAB toolbox that implements the algorithm is also made available to facilitate further research.
- Citation:
- Practical Global Optimization for Multiview Geometry. Kahl, F., Agarwal, S., Chandraker, M., Kriegman, D. and Belongie, S. International Journal of Computer Vision, 79(3), September 2008, pages 271-284.
- On-line documents:
- Complete article (PDF)
Rectified Surface Mosaics
- Abstract:
- We approach mosaicing as a camera tracking problem within a known parameterized surface. From a video of a camera moving within a surface, we compute a mosaic representing the texture of that surface, flattened onto a planar image. Our approach works by defining a warp between images as a function of surface geometry and camera pose. Globally optimizing this warp to maximize alignment across all frames determines the camera trajectory, and the corresponding flattened mosaic image. In contrast to previous mosaicing methods which assume planar or distant scenes, or controlled camera motion, our approach enables mosaicing in cases where the camera moves unpredictably through proximal surfaces, such as in medical endoscopy applications.
- Citation:
- Rectified Surface Mosaics. Carroll, R. E. and Seitz, S. M. IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA 2007), Rio de Janeiro, Brazil, October 2007.
- On-line documents:
- Complete article (PDF, 2.6MB)
A Probabilistic Model for Object Recognition, Segmentation, and Non-Rigid Correspondence
- Abstract:
- We describe a method for fully automatic object recognition and segmentation using a set of reference images to specify the appearance of each object. Our method uses a generative model of image formation that takes into account occlusions, simple lighting changes, and object deformations. We take advantage of local features to identify, locate, and extract multiple objects in the presence of large viewpoint changes, nonrigid motions with large numbers of degrees of freedom, occlusions, and clutter. We simultaneously compute an object-level segmentation and a dense correspondence between the pixels of the appropriate reference images and the image to be segmented.
- Citation:
- A Probabilistic Model for Object Recognition, Segmentation, and Non-Rigid Correspondence. Simon, I. and Seitz, S. M. Proceedings of CVPR 2007, Minneapolis, Minnesota, June 2007.
- On-line documents:
- Complete article (PDF, 1.6MB)
Scene Summarization for Online Image Collections
- Abstract:
- We formulate the problem of scene summarization as selecting a set of images that efficiently represents the visual content of a given scene. The ideal summary presents the most interesting and important aspects of the scene with minimal redundancy. We propose a solution to this problem using multi-user image collections from the Internet. Our solution examines the distribution of images in the collection to select a set of canonical views to form the scene summary, using clustering techniques on visual features. The summaries we compute also lend themselves naturally to the browsing of image collections, and can be augmented by analyzing user-specified image tag data. We demonstrate the approach using a collection of images of the city of Rome, showing the ability to automatically decompose the images into separate scenes, and identify canonical views for each scene.
- Citation:
- Scene Summarization for Online Image Collections. Simon, I., Snavely, N. and Seitz, S. M. Proceedings of ICCV 2007, Rio de Janeiro, Brazil, October 2007.
- On-line documents:
- Complete article (PDF, 2.4MB)
Project Page
Multi-View Stereo for Community Photo Collections
- Abstract:
- We present a multi-view stereo algorithm that addresses the extreme changes in lighting, scale, clutter, and other effects in large online community photo collections. Our idea is to intelligently choose images to match, both at a per-view and per-pixel level. We show that such adaptive view selection enables robust performance even with dramatic appearance variability. The stereo matching technique takes as input sparse 3D points reconstructed from structure-from-motion methods and iteratively grows surfaces from these points. Optimizing for surface normals within a photoconsistency measure significantly improves the matching results. While the focus of our approach is to estimate high-quality depth maps, we also show examples of merging the resulting depth maps into compelling scene reconstructions. We demonstrate our algorithm on standard multi-view stereo datasets and on casually acquired photo collections of famous scenes gathered from the Internet.
- Citation:
- Multi-View Stereo for Community Photo Collections. Goesele, M., Snavely, N., Curless, B., Hoppe, H. and Seitz, S. M. Proceedings of ICCV 2007, Rio de Janeiro, Brazil, October 2007.
- On-line documents:
- Complete article (PDF, 9.2MB)
Project Page
Globally Optimal Affine and Metric Upgrades in Stratified Autocalibration
- Abstract:
- We present a practical, stratified autocalibration algorithm with theoretical guarantees of global optimality. Given a projective reconstruction, the first stage of the algorithm upgrades it to affine by estimating the position of the plane at infinity. The plane at infinity is computed by globally minimizing a least squares formulation of the modulus constraints. In the second stage, the algorithm upgrades this affine reconstruction to a metric one by globally minimizing the infinite homography relation to compute the dual image of the absolute conic (DIAC). The positive semidefiniteness of the DIAC is explicitly enforced as part of the optimization process, rather than as a post-processing step.
For each stage, we construct and minimize tight convex relaxations of the highly non-convex objective functions in a branch and bound optimization framework. We exploit the problem structure to restrict the search space for the DIAC and the plane at infinity to a small, fixed number of branching dimensions, independent of the number of views.
Experimental evidence of the accuracy, speed and scalability of our algorithm is presented on synthetic and real data. MATLAB code for the implementation is made available to the community.
- Citation:
- Globally Optimal Affine and Metric Upgrades in Stratified Autocalibration. Chandraker, M., Agarwal, S., Kriegman, D. and Belongie, S. Proceedings of ICCV 2007, Rio de Janeiro, Brazil, October 2007.
- On-line documents:
- Complete article (PDF, 1.5MB)
Relations, Cards, and Search Templates: User-Guided Web Data Integration and Layout
- Abstract:
- We present three new interaction techniques for aiding users in collecting and organizing Web content. First, we demonstrate an interface for creating associations between websites, which facilitate the automatic retrieval of related content. Second, we present an authoring interface that allows users to quickly merge content from many different websites into a uniform and personalized representation, which we call a card. Finally, we introduce a novel search paradigm that leverages the relationships in a card to direct search queries to extract relevant content from multiple Web sources and fill a new series of cards instead of just returning a list of webpage URLs. Preliminary feedback from users is positive and validates our design.
- Citation:
- Relations, Cards, and Search Templates: User-Guided Web Data Integration and Layout. Dontcheva, M., Drucker, S. M., Salesin, D. H. and Cohen, M. F. Proceedings of UIST 2007, Newport, Rhode Island, October 2007.
- On-line documents:
- Complete article (PDF, 9.3MB)
Near-optimal Character Animation with Continuous Control
- Abstract:
- We present a new model for real-time character animation with multidimensional, interactive control. The underlying motion engine is data-driven, enables rapid transitions, and automatically enforces foot-skate constraints without inverse kinematics. On top of this motion space, our algorithm learns approximately optimal controllers which use a compact basis representation to guide the system through multidimensional state-goal spaces. These controllers enable real-time character animation that fluidly responds to changing user directives and environmental constraints.
- Citation:
- Near-optimal Character Animation with Continuous Control. Treuille, A., Lee, Y., and Popović, Z. ACM Transactions on Graphics 26(3), August 2007.
- On-line documents:
- Complete article (PDF, 0.8MB)
Project Page
Video Watercolorization using Bidirectional Texture Advection
- Abstract:
- In this paper, we present a method for creating watercolor-like animation, starting from video as input. The method involves two main steps: applying textures that simulate a watercolor appearance; and creating a simplified, abstracted version of the video to which the texturing operations are applied. Both of these steps are subject to highly visible temporal artifacts, so the primary technical contributions of the paper are extensions of previous methods for texturing and abstraction to provide temporal coherence when applied to video sequences. To maintain coherence for textures, we employ texture advection along lines of optical flow. We furthermore extend previous approaches by incorporating advection in both forward and reverse directions through the video, which allows for minimal texture distortion, particularly in areas of disocclusion that are otherwise highly problematic. To maintain coherence for abstraction, we employ mathematical morphology extended to the temporal domain, using filters whose temporal extents are locally controlled by the degree of distortions in the optical flow. Together, these techniques provide the first practical and robust approach for producing watercolor animations from video, which we demonstrate with a number of examples.
- Citation:
- Video Watercolorization using Bidirectional Texture Advection. Adrien Bousseau, Fabrice Neyret, Joëlle Thollot, David Salesin. ACM Transactions on Graphics 26(3), August 2007.
- On-line documents:
- Complete article (PDF, 5.3MB)
Project Page
Active Learning for Real-time Motion Controllers
- Abstract:
- This paper describes an approach to building real-time highly-controllable characters. A kinematic character controller is built on-the-fly during a capture session, and updated after each new motion clip is acquired. Active learning is used to identify which motion sequence the user should perform next, in order to improve the quality and responsiveness of the controller. Because motion clips are selected adaptively, we avoid the difficulty of manually determining which ones to capture, and can build complex controllers from scratch while significantly reducing the number of necessary motion samples.
- Citation:
- Active Learning for Real-time Motion Controllers. Seth Cooper, Aaron Hertzmann, Zoran Popović. ACM Transactions on Graphics 26(3), August 2007.
- On-line documents:
- Complete article (PDF, 2.4MB)
Project Page
Layered Depth Panoramas
- Abstract:
- Representations for interactive photorealistic visualization of scenes range from compact 2D panoramas to data-intensive 4D light fields. In this paper, we propose a technique for creating a layered representation from a sparse set of images taken with a hand-held camera. This representation, which we call a layered depth panorama (LDP), allows the user to experience 3D by off-axis panning. It combines the compelling experience of panoramas with limited 3D navigation. Our choice of representation is motivated by ease of capture and compactness. We formulate the problem of constructing the LDP as the recovery of color and geometry in a multi-perspective cylindrical disparity space. We leverage a graph cut approach to sequentially determine the disparity and color of each layer using multi-view stereo. Geometry visible through the cracks at depth discontinuities in a frontmost layer is determined and assigned to layers behind the frontmost layer. All layers are then used to render novel panoramic views with parallax. We demonstrate our approach on a variety of complex outdoor and indoor scenes.
- Citation:
- Layered Depth Panoramas. Ke Colin Zheng, Sing Bing Kang, Michael Cohen, Richard Szeliski. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 4.4MB)
Project Page
Soft Scissors: An Interactive Tool for Realtime High Quality Matting
- Abstract:
- We present Soft Scissors, an interactive tool for extracting alpha mattes of foreground objects in realtime. We recently proposed a novel offline matting algorithm capable of extracting high-quality mattes for complex foreground objects such as furry animals [Wang and Cohen 2007]. In this paper we both improve the quality of our offline algorithm and give it the ability to incrementally update the matte in an online interactive setting. Our realtime system efficiently estimates foreground color thereby allowing both the matte and the final composite to be revealed instantly as the user roughly paints along the edge of the foreground object. In addition, our system can dynamically adjust the width and boundary conditions of the scissoring paint brush to approximately capture the boundary of the foreground object that lies ahead on the scissor's path. These advantages in both speed and accuracy create the first interactive tool for high quality image matting and compositing.
- Citation:
- Soft Scissors: An Interactive Tool for Realtime High Quality Matting. Jue Wang, Maneesh Agrawala and Michael Cohen. ACM Transactions on Graphics 26(3), August 2007.
- On-line documents:
- Complete article (PDF, 5.6MB)
Simultaneous Matting and Compositing
- Abstract:
- Recent work in matting, hole filling, and compositing allows image elements to be mixed in a new composite image. Previous algorithms for matting foreground elements have assumed that the new background for compositing is unknown. We show that, if the new background is known, the matting algorithm has more freedom to create a successful matte by simultaneously optimizing the matting and compositing operations.
We propose a new algorithm that integrates matting and compositing into a single optimization process. The system is able to compose foreground elements onto a new background more efficiently and with fewer artifacts than previous approaches. In our examples, we show how one can enlarge the foreground while maintaining the wide angle view of the background. We also demonstrate composing a foreground element on top of similar backgrounds to help remove unwanted portions of the background or to re-scale or re-arrange the composite. We compare and contrast our method with a number of previous matting and compositing systems.
- Citation:
- Simultaneous Matting and Compositing. Jue Wang and Michael Cohen. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 6.8MB)
Optimized Color Sampling for Robust Matting
- Abstract:
- Image matting is the problem of determining for each pixel in an image whether it is foreground, background, or the mixing parameter, "alpha," for those pixels that are a mixture of foreground and background. Matting is inherently an ill-posed problem. Previous matting approaches either use naive color sampling methods to estimate foreground and background colors for unknown pixels, or use propagation-based methods to avoid color sampling under weak assumptions about image statistics. We argue that neither method by itself is enough to generate good results for complex natural images.
We analyze the weaknesses of previous matting approaches, and propose a new robust matting algorithm. In our approach we also sample foreground and background colors for unknown pixels, but more importantly, analyze the confidence of these samples. Only high confidence samples are chosen to contribute to the matting energy function which is minimized by a Random Walk. The energy function we define also contains a neighborhood term to enforce the smoothness of the matte. To validate the approach, we present an extensive and quantitative comparison between our algorithm and a number of previous approaches in hopes of providing a benchmark for future matting research.
- Citation:
- Optimized Color Sampling for Robust Matting. Jue Wang and Michael Cohen. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 3.2MB)
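For readers new to matting, the "mixing parameter" mentioned in the abstract above is usually defined through the standard compositing equation. The symbols below are the conventional ones and are assumed here for illustration; they are not taken from the paper itself. For a pixel p with observed color I_p, foreground color F_p, background color B_p, and alpha value α_p:

```latex
I_p = \alpha_p F_p + (1 - \alpha_p)\, B_p, \qquad 0 \le \alpha_p \le 1
```

Only I_p is observed, while α_p, F_p, and B_p are all unknown at each pixel. With three color channels this gives three equations in seven unknowns, which is one way to see why the abstract calls matting inherently ill-posed.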
Principal Curvature-Based Region Detector for Object Recognition
- Abstract:
- This paper presents a new structure-based interest region detector called Principal Curvature-Based Regions (PCBR) which we use for object class recognition. The PCBR interest operator detects stable watershed regions within the multi-scale principal curvature image. To detect robust watershed regions, we "clean" a principal curvature image using a combination of grayscale morphological closing and a new "eigenvector flow" hysteresis thresholding. Robustness across scales is achieved by selecting the maximal stable regions across consecutive scales. PCBR typically detects distinctive patterns distributed evenly on the objects and it shows significant robustness to local intensity perturbations and intra-class variations. We evaluate PCBR both qualitatively (through visual inspection) and quantitatively (by measuring repeatability and classification accuracy in real-world object-class recognition problems). Experiments on different benchmark datasets show that PCBR is comparable or superior to state-of-the-art detectors for both feature matching and object recognition problems. Moreover, we demonstrate the application of PCBR to symmetry detection.
- Citation:
- Principal Curvature-Based Region Detector for Object Recognition. Hongli Deng, Wei Zhang, Eric Mortensen, Thomas Dietterich, Linda Shapiro. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 3.0MB)
Using Photographs to Enhance Videos of a Static Scene
- Abstract:
- We present a framework for automatically enhancing videos of a static scene using a few photographs of the same scene. For example, our system can transfer photographic qualities such as high resolution, high dynamic range and better lighting from the photographs to the video. Additionally, the user can quickly modify the video by editing only a few still images of the scene. Finally, our system allows a user to remove unwanted objects and camera shake from the video. These capabilities are enabled by two technical contributions presented in this paper. First, we make several improvements to a state-of-the-art multiview stereo algorithm in order to compute view-dependent depths using video, photographs, and structure-from-motion data. Second, we present a novel image-based rendering algorithm that can re-render the input video using the appearance of the photographs while preserving certain temporal dynamics such as specularities and dynamic scene lighting.
- Citation:
- Using Photographs to Enhance Videos of a Static Scene. Pravin Bhat, C. Lawrence Zitnick, Noah Snavely, Aseem Agarwala, Maneesh Agrawala, Michael Cohen, Brian Curless, Sing Bing Kang. Eurographics Symposium on Rendering 2007.
- On-line documents:
- Complete article (PDF, 20.0MB)
Project Page
Automated Insect Identification through Concatenated Histograms of Local Appearance Features
- Abstract:
- This paper describes a computer vision approach to automated rapid-throughput taxonomic identification of stonefly larvae. The long-term goal of this research is to develop a cost-effective method for environmental monitoring based on automated identification of indicator species. Recognition of stonefly larvae is challenging because they are highly articulated, they exhibit a high degree of intraspecies variation in size and color, and some species are difficult to distinguish visually, despite prominent dorsal patterning. The stoneflies are imaged via an apparatus that manipulates the specimens into the field of view of a microscope so that images are obtained under highly repeatable conditions. The images are then classified through a process that involves (a) identification of regions of interest, (b) representation of those regions as SIFT vectors [1], (c) classification of the SIFT vectors into learned "features" to form a histogram of detected features, and (d) classification of the feature histogram via state-of-the-art ensemble classification algorithms. The steps (a) to (c) compose the concatenated feature histogram (CFH) method. We apply three region detectors for part (a) above, including a newly developed principal curvature-based region (PCBR) detector. This detector finds stable regions of high curvature via a watershed segmentation algorithm. We compute a separate dictionary of learned features for each region detector, and then concatenate the histograms prior to the final classification step.
We evaluate this classification methodology on a task of discriminating among four stonefly taxa, two of which, Calineuria and Doroneuria, are difficult even for experts to discriminate. The results show that the combination of all three detectors gives four-class accuracy of 82% and three-class accuracy (pooling Calineuria and Doroneuria) of 95%. Each region detector makes a valuable contribution. In particular, our new PCBR detector is able to discriminate Calineuria and Doroneuria much better than the other detectors.
- Citation:
- Automated Insect Identification through Concatenated Histograms of Local Appearance Features: Feature Vector Generation and Region Detection for Deformable Objects. Enrique Larios, Hongli Deng, Wei Zhang, Matt Sarpola, Jenny Yuen, Robert Paasch, Andrew Moldenke, David Lytle, Salvador Ruiz Correa, Eric Mortensen, Linda Shapiro, and Tom Dietterich. In Machine Vision and Applications, 2007.
- On-line documents:
- Complete article (PDF, 0.9MB)
Interactive Cutaway Illustrations of Complex 3D Models
- Abstract:
- We present a system for authoring and viewing interactive cutaway illustrations of complex 3D models using conventions of traditional scientific and technical illustration. Our approach is based on the two key ideas that 1) cuts should respect the geometry of the parts being cut, and 2) cutaway illustrations should support interactive exploration. In our approach, an author instruments a 3D model with auxiliary parameters, which we call "rigging," that define how cutaways of that structure are formed. We provide an authoring interface that automates most of the rigging process. We also provide a viewing interface that allows viewers to explore rigged models using high-level interactions. In particular, the viewer can just select a set of target structures, and the system will automatically generate a cutaway illustration that exposes those parts. We have tested our system on a variety of CAD and anatomical models, and our results demonstrate that our approach can be used to create and view effective interactive cutaway illustrations for a variety of complex objects with little user effort.
- Citation:
- Interactive Cutaway Illustrations of Complex 3D Models. Wilmot Li, Lincoln Ritter, Maneesh Agrawala, Brian Curless, David Salesin. ACM Transactions on Graphics 26(3), August 2007.
- On-line documents:
- Complete article (PDF, 15.0MB)
Project Page
A Theory of Frequency Domain Invariants: Spherical Harmonic Identities for BRDF / Lighting Transfer and Image Consistency
- Abstract:
- This paper develops a theory of frequency domain invariants in computer vision. We derive novel identities using spherical harmonics, which are the angular frequency domain analog to common spatial domain invariants such as reflectance ratios. These invariants are derived from the spherical harmonic convolution framework for reflection from a curved surface. Our identities apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. For this case, we derive a novel identity, independent of the specific lighting configurations or BRDFs, that allows us to directly estimate the fourth image if the other three are available. The identity can also be used as an invariant to detect tampering in the images.
While this paper is primarily theoretical, it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.
- Citation:
- A Theory of Frequency Domain Invariants: Spherical Harmonic Identities for BRDF / Lighting Transfer and Image Consistency. Dhruv Mahajan, Ravi Ramamoorthi, and Brian Curless. To appear, IEEE Transactions on Pattern Analysis and Machine Intelligence.
- On-line documents:
- Complete article (PDF, 3.0MB)
Devices That Tell On You: Privacy Trends in Consumer Ubiquitous Computing
- Abstract:
- We analyze three new consumer electronic gadgets in order to gauge the privacy and security trends in mass-market UbiComp devices. Our study of the Slingbox Pro uncovers a new information leakage vector for encrypted streaming multimedia. By exploiting properties of variable bitrate encoding schemes, we show that a passive adversary can determine with high probability the movie that a user is watching via her Slingbox, even when the Slingbox uses encryption. We experimentally evaluated our method against a database of over 100 hours of network traces for 26 distinct movies.
Despite an opportunity to provide significantly more location privacy than existing devices, like RFIDs, we find that an attacker can trivially exploit the Nike+iPod Sport Kit's design to track users; we demonstrate this with a GoogleMaps-based distributed surveillance system. We also uncover security issues with the way Microsoft Zunes manage their social relationships.
We show how these products' designers could have significantly raised the bar against some of our attacks. We also use some of our attacks to motivate fundamental security and privacy challenges for future UbiComp devices.
- Citation:
- Devices That Tell On You: Privacy Trends in Consumer Ubiquitous Computing. T. Scott Saponas, Jonathan Lester, Carl Hartung, Sameer Agarwal and Tadayoshi Kohno, to appear USENIX Security 2007.
- On-line documents:
- Complete article (PDF, 1.5MB)
Project Page
Generalized Non-metric Multidimensional Scaling
- Citation:
- Generalized Non-metric Multidimensional Scaling. Sameer Agarwal, Josh Wills, Lawrence Cayton, Gert Lanckriet, David Kriegman and Serge Belongie. AISTATS 2007, San Juan, Puerto Rico.
- On-line documents:
- Complete article (PDF, 0.9MB)
ShadowCuts: Photometric Stereo with Shadows
- Abstract:
- We present an algorithm for performing Lambertian photometric stereo in the presence of shadows. The algorithm has three novel features. First, a fast graph cuts based method is used to estimate per pixel light source visibility. Second, it allows images to be acquired with multiple illuminants, and there can be fewer images than light sources. This leads to better surface coverage and improves the reconstruction accuracy by enhancing the signal to noise ratio and the condition number of the light source matrix. The ability to use fewer images than light sources means that the imaging effort grows sublinearly with the number of light sources. Finally, the recovered shadow maps are combined with shading information to perform constrained surface normal integration. This reduces the low frequency bias inherent to the normal integration process and ensures that the recovered surface is consistent with the shadowing configuration.
The algorithm works with as few as four light sources and four images. We report results for light source visibility detection and high quality surface reconstructions for synthetic and real datasets.
- Citation:
- ShadowCuts: Photometric Stereo with Shadows. Manmohan Chandraker, Sameer Agarwal, David Kriegman. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 1.6MB)
Autocalibration via Rank-Constrained Estimation of the Absolute Quadric
- Abstract:
- We present an autocalibration algorithm for upgrading a projective reconstruction to a metric reconstruction by estimating the absolute dual quadric. The algorithm enforces the rank degeneracy and the positive semidefiniteness of the dual quadric as part of the estimation procedure, rather than as a post-processing step. Furthermore, the method allows the user, if he or she so desires, to enforce conditions on the plane at infinity so that the reconstruction satisfies the chirality constraints.
The algorithm works by constructing low degree polynomial optimization problems, which are solved to their global optimum using a series of convex linear matrix inequality relaxations. The algorithm is fast, stable, robust and has time complexity independent of the number of views. We show extensive results on synthetic as well as real datasets to validate our algorithm.
- Citation:
- Autocalibration via Rank-Constrained Estimation of the Absolute Quadric. Manmohan Chandraker, Sameer Agarwal, Fredrik Kahl, David Nistér, David Kriegman. CVPR 2007, Minneapolis, Minnesota.
- On-line documents:
- Complete article (PDF, 0.2MB)
Stylizing 2.5-D Video
- Abstract:
- In recent years considerable interest has been given to non-photorealistic rendering of photographs, video, and 3D models for illustrative or artistic purposes. Conventional 2D inputs such as photographs and video are easy to create and capture, while 3D models allow for a wider variety of stylization techniques, such as cross-hatching. In this paper, we propose using video with depth information (2.5D video) to combine the advantages of 2D and 3D input. 2.5D video is becoming increasingly easy to capture, and with the additional depth information, stylization techniques that require shape information can be applied. However, because 2.5D video contains only limited shape information and 3D correspondence over time is unknown, it is difficult to create temporally coherent stylized animations directly from raw 2.5D video. In this paper, we present techniques for processing 2.5D video to overcome these drawbacks, and demonstrate several styles that can be created using these techniques.
- Citation:
- Stylizing 2.5-D Video. Noah Snavely, C. Lawrence Zitnick, Sing Bing Kang, Michael Cohen. In Proc. Symposium on Non-Photorealistic Animation and Rendering (NPAR) 2006, pages 63-69.
- On-line documents:
- Complete article (PDF, 0.8MB)
Summarizing Personal Web Browsing Sessions
- Abstract:
- We describe a system, implemented as a browser extension, that enables users to quickly and easily collect, view, and share personal Web content. Our system employs a novel interaction model, which allows a user to specify webpage extraction patterns by interactively selecting webpage elements and applying these patterns to automatically collect similar content. Further, we present a technique for creating visual summaries of the collected information by combining user labeling with predefined layout templates. These summaries are interactive in nature: depending on the behaviors encoded in their templates, they may respond to mouse events, in addition to providing a visual summary. Finally, the summaries can be saved or sent to other users to continue the research at another place or time. Informal evaluation shows that our approach works well for popular websites, and that users can quickly learn this interaction model for collecting Web content.
- Citation:
- Mira Dontcheva, Steven Drucker, Geraldine Wade, David Salesin and Michael F. Cohen. Summarizing Personal Web Browsing Sessions. Proceedings of ACM UIST 2006.
- On-line documents:
- Complete article (PDF, 5.0MB)
Project Page
Painting With Texture
- Abstract:
- We present an interactive texture painting system that allows the user to author digital images by painting with a palette of input textures. At the core of our system is an interactive texture synthesis algorithm that generates textures with natural-looking boundary effects and alpha information as the user paints. Furthermore, we describe an intuitive layered painting model that allows strokes of texture to be merged, intersected and overlapped while maintaining the appropriate boundaries between texture regions. We demonstrate the utility and expressiveness of our system by painting several images using textures that exhibit a range of different boundary effects.
- Citation:
- Lincoln Ritter, Wilmot Li, Maneesh Agrawala, Brian Curless, David Salesin. Painting With Texture. Proceedings of the 17th Eurographics Symposium on Rendering, 2006.
- On-line documents:
- Complete article (PDF, 1.7MB)
Project Page
Learning a correlated model of identity and pose-dependent body shape variation for real-time synthesis
- Abstract:
- We present a method for learning a model of human body shape variation from a corpus of 3D range scans. Our model is the first to capture both identity-dependent and pose-dependent shape variation in a correlated fashion, enabling creation of a variety of virtual human characters with realistic and non-linear body deformations that are customized to the individual. Our learning method is robust to irregular sampling in pose-space and identity space, and also to missing surface data in the examples. Our synthesized character models are based on standard skinning techniques and can be rendered in real time.
- Citation:
- Brett Allen, Brian Curless, Zoran Popović, Aaron Hertzmann. Learning a correlated model of identity and pose-dependent body shape variation for real-time synthesis. Proceedings of the ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2006, pp. 147-156.
- On-line documents:
- Complete article (PDF, 1.9MB)
Project Page
Gaze-Based Interaction for Semi-Automatic Photo Cropping
- Abstract:
- We present an interactive method for cropping photographs given minimal information about the location of important content, provided by eye tracking. Cropping is formulated in a general optimization framework that facilitates adding new composition rules, as well as adapting the system to particular applications. Our system uses fixation data to identify important content and compute the best crop for any given aspect ratio or size, enabling applications such as automatic snapshot recomposition, adaptive documents, and thumbnailing. We validate our approach with studies in which users compare our crops to ones produced by hand and by a completely automatic approach. Experiments show that viewers prefer our gaze-based crops to uncropped images and fully automatic crops.
- Citation:
- Anthony Santella, Maneesh Agrawala, Doug DeCarlo, David H. Salesin, Michael F. Cohen. Gaze-Based Interaction for Semi-Automatic Photo Cropping. ACM Human Factors in Computing Systems (CHI), 2006, pp. 771-780.
- On-line documents:
- Complete article (PDF, 2.2MB)
Project Page
Photo Tourism: Exploring Photo Collections in 3D
- Abstract:
- We present a system for interactively browsing and exploring large unstructured collections of photographs of a scene using a novel 3D interface. Our system consists of an image-based modeling front end that automatically computes the viewpoint of each photograph as well as a sparse 3D model of the scene and image-to-model correspondences. Our photo explorer uses image-based rendering techniques to smoothly transition between photographs, while also enabling full 3D navigation and exploration of the set of images and world geometry, along with auxiliary information such as overhead maps. Our system also makes it easy to construct photo tours of scenic or historic locations, and to annotate image details, which are automatically transferred to other relevant images. We demonstrate our system on several large personal photo collections as well as images gathered from Internet photo sharing sites.
- Citation:
- Noah Snavely, Steven M. Seitz, Richard Szeliski. Photo Tourism: Exploring Photo Collections in 3D. ACM Transactions on Graphics 25(3) (ACM SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 1.7MB)
Project Page
The Cartoon Animation Filter
- Abstract:
- We present the "Cartoon Animation Filter," a simple filter that takes an arbitrary input motion signal and modulates it in such a way that the output motion is more "alive" or "animated." The filter adds a smoothed, inverted, and (sometimes) time shifted version of the second derivative (the acceleration) of the signal back into the original signal. Almost all parameters of the filter are automated. The user only needs to set the desired strength of the filter. The beauty of the animation filter lies in its simplicity and generality. We apply the filter to motions ranging from hand drawn trajectories, to simple animations within PowerPoint presentations, to motion captured DOF curves, to video segmentation results. Experimental results show that the filtered motion exhibits anticipation, follow-through, exaggeration and squash-and-stretch effects which are not present in the original input motion data.
- Citation:
- Jue Wang, Steven M. Drucker, Maneesh Agrawala, Michael F. Cohen. The Cartoon Animation Filter. ACM Transactions on Graphics 25(3) (ACM SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 0.6MB)
Project Page
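The filter described in the Cartoon Animation Filter abstract above (subtracting a smoothed second derivative from the signal) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the finite-difference derivative, the Gaussian smoothing kernel, the boundary clamping, and all function and parameter names (`cartoon_filter`, `strength`, `sigma`) are assumptions, and the optional time shift is omitted.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights on [-radius, radius]."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def second_derivative(signal):
    """Central finite-difference second derivative; endpoints are left at zero."""
    n = len(signal)
    d2 = [0.0] * n
    for i in range(1, n - 1):
        d2[i] = signal[i - 1] - 2.0 * signal[i] + signal[i + 1]
    return d2

def smooth(signal, kernel):
    """Convolve with a symmetric kernel, clamping indices at the boundaries."""
    n = len(signal)
    radius = len(kernel) // 2
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def cartoon_filter(signal, strength=1.0, sigma=2.0):
    """Add the inverted, smoothed acceleration back into the signal (assumed form)."""
    radius = max(1, int(3 * sigma))
    accel = smooth(second_derivative(signal), gaussian_kernel(sigma, radius))
    return [x - strength * a for x, a in zip(signal, accel)]

# A step-like motion: the object jumps from position 0 to position 1.
step = [0.0] * 20 + [1.0] * 20
filtered = cartoon_filter(step, strength=3.0)
```

On the step input, the filtered curve dips below the starting position just before the jump and overshoots the target just after it, which is one way to read the anticipation and follow-through effects the abstract mentions.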
Composition of Complex Optimal Multi-Character Motions
- Abstract:
- This paper presents a physics-based method for creating complex multi-character motions from short single-character sequences. We represent multi-character motion synthesis as a spacetime optimization problem where constraints represent the desired character interactions. We extend standard spacetime optimization with a novel timewarp parameterization in order to jointly optimize the motion and the interaction constraints. In addition, we present an optimization algorithm based on block coordinate descent and continuations that can be used to solve the large problems that multiple characters usually generate. This framework allows us to synthesize multi-character motion drastically different from the input motion. Consequently, a small set of input motions is sufficient to express a wide variety of multi-character motions.
- Citation:
- C. Karen Liu, Aaron Hertzmann, Zoran Popović. Composition of Complex Optimal Multi-Character Motions. ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2006.
- On-line documents:
- Complete article (PDF, 3.5MB)
Project Page
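The block coordinate descent idea named in the multi-character motion abstract above can be illustrated on a toy problem. The sketch below is not the paper's solver: the objective, the coupling term standing in for an interaction constraint, and the name `block_coordinate_descent` are all hypothetical, and the continuation strategy is not shown. Each "block" is a single scalar that is minimized exactly while the other is held fixed.

```python
def block_coordinate_descent(coupling=1.0, sweeps=50):
    """Minimize (x - 1)^2 + (y - 2)^2 + coupling * (x - y)^2 one block at a time."""
    x, y = 0.0, 0.0
    for _ in range(sweeps):
        # Block 1: exact minimizer over x with y held fixed
        # (set the derivative 2(x - 1) + 2*coupling*(x - y) to zero).
        x = (1.0 + coupling * y) / (1.0 + coupling)
        # Block 2: exact minimizer over y with x held fixed.
        y = (2.0 + coupling * x) / (1.0 + coupling)
    return x, y

x_opt, y_opt = block_coordinate_descent(coupling=1.0)
```

Alternating over blocks keeps each subproblem small, which is the appeal when the joint problem over all characters would be too large to optimize at once.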
Photographing Long Scenes with Multi-Viewpoint Panoramas
- Abstract:
- We present a system for producing multi-viewpoint panoramas of long, roughly planar scenes, such as the facades of buildings along a city street, from a relatively sparse set of photographs captured with a handheld still camera that is moved along the scene. Our work is a significant departure from previous methods for creating multi-viewpoint panoramas, which composite thin vertical strips from a video sequence captured by a translating video camera, in that the resulting panoramas are composed of relatively large regions of ordinary perspective. In our system, the only user input required beyond capturing the photographs themselves is to identify the dominant plane of the photographed scene; our system then computes a panorama automatically using Markov Random Field optimization. Users may exert additional control over the appearance of the result by drawing rough strokes that indicate various high-level goals. We demonstrate the results of our system on several scenes, including urban streets, a river bank, and a grocery store aisle.
- Citation:
- Aseem Agarwala, Maneesh Agrawala, Michael F. Cohen, David H. Salesin, Richard Szeliski. Photographing Long Scenes with Multi-Viewpoint Panoramas. ACM Transactions on Graphics 25(3) (ACM SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 4.3MB)
Project Page
Volumetric Density Capture From a Single Image
- Abstract:
- We propose a new approach to capture the volumetric density of scattering media instantaneously with a single image. The volume is probed with a set of laser lines and the scattered intensity is recorded by a conventional camera. We then determine the density along the laser lines taking the scattering properties of the media into account. A specialized interpolation technique reconstructs the full density field in the volume. We apply the technique to capture the volumetric density of participating media such as smoke.
- Citation:
- Christian Fuchs, Tongbo Chen, Michael Goesele, Holger Theisel, Hans-Peter Seidel. Volumetric Density Capture From a Single Image, Proceedings of the International Workshop on Volume Graphics 2006, July 2006.
- On-line documents:
- Complete article (PDF, 3.5MB)
Model Reduction for Real-time Fluids
- Abstract:
- We present a new model reduction approach to fluid simulation, enabling large, real-time, detailed flows with continuous user interaction. Our reduced model can also handle moving obstacles immersed in the flow. We create separate models for the velocity field and for each moving boundary, and show that the coupling forces may be reduced as well. Our results indicate that surprisingly few basis functions are needed to resolve small but visually important features such as spinning vortices.
- Citation:
- Adrien Treuille, Andrew Lewis, Zoran Popović. Model Reduction for Real-time Fluids, ACM Transactions on Graphics 25(3) (SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 3MB)
Project Page
Continuum Crowds
- Abstract:
- We present a real-time crowd model based on continuum dynamics. In our model, a dynamic potential field simultaneously integrates global navigation with moving obstacles such as other people, efficiently solving for the motion of large crowds without the need for explicit collision avoidance. Simulations created with our system run at interactive rates, demonstrate smooth flow under a variety of conditions, and naturally exhibit emergent phenomena that have been observed in real crowds.
- Citation:
- Adrien Treuille, Seth Cooper, Zoran Popović. Continuum Crowds, ACM Transactions on Graphics 25(3) (SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 3.4MB)
Project Page
Schematic Storyboarding for Video Visualization and Editing
- Abstract:
- We present a method for visualizing short video clips in a single static image, using the visual language of storyboards. These schematic storyboards are composed from multiple input frames and annotated using outlines, arrows, and text describing the motion in the scene. The principal advantage of this storyboard representation over standard representations of video (generally either a static thumbnail image or a playback of the video clip in its entirety) is that it requires only a moment to observe and comprehend but at the same time retains much of the detail of the source video. Our system renders a schematic storyboard layout based on a small amount of user interaction. We also demonstrate an interaction technique to scrub through time using the natural spatial dimensions of the storyboard. Potential applications include video editing, surveillance summarization, assembly instructions, composition of graphic novels, and illustration of camera technique for film studies.
- Citation:
- Dan B Goldman, Brian Curless, David H. Salesin, Steven M. Seitz. Schematic Storyboarding for Video Visualization and Editing, ACM Transactions on Graphics 25(3), (SIGGRAPH 2006), July 2006.
- On-line documents:
- Complete article (PDF, 6MB)
Project Page
Spatio-Angular Resolution Tradeoff in Integral Photography
- Abstract:
- An integral camera samples the 4D light field of a scene within a single photograph. This paper explores the fundamental tradeoff between spatial resolution and angular resolution that is inherent to integral photography. Based on our analysis we divide previous integral camera designs into two classes depending on how the 4D light field is distributed (multiplexed) over the 2D sensor. Our optical treatment is mathematically rigorous and extensible to the broader area of light field research. We argue that for many real-world scenes it is beneficial to sacrifice angular resolution for higher spatial resolution. The missing angular resolution is then interpolated using techniques from computer vision. We have developed a prototype integral camera that uses a system of lenses and prisms as an external attachment to a conventional camera. We have used this prototype to capture the light fields of a variety of scenes. We show examples of novel view synthesis and refocusing where the spatial resolution is significantly higher than is possible with previous designs.
- Citation:
- Todor Georgiev, Ke Colin Zheng, Brian Curless, David H. Salesin, Shree Nayar, Chintan Intwala. Spatio-Angular Resolution Tradeoff in Integral Photography, Proceedings of Eurographics Symposium on Rendering, 2006.
- On-line documents:
- Complete article (PDF, 0.6MB)
Project Page
Multi-View Stereo Revisited
- Abstract:
- We present an extremely simple yet robust multi-view stereo algorithm and analyze its properties. The algorithm first computes individual depth maps using a window-based voting approach that returns only good matches. The depth maps are then merged into a single mesh using a straightforward volumetric approach. We show results for several datasets, showing accuracy comparable to the best of the current state-of-the-art techniques and rivaling more complex algorithms.
- Citation:
- Michael Goesele, Steven M. Seitz and Brian Curless. Multi-View Stereo Revisited, Proceedings of CVPR 2006, New York, NY, USA, June 2006.
- On-line documents:
- Complete article (PDF, 5.3MB)
Mesostructure from Specularity
- Abstract:
- We describe a simple and robust method for surface mesostructure acquisition. Our method builds on the observation that specular reflection is a reliable visual cue for surface mesostructure perception. In contrast to most photometric stereo methods, which take specularities as outliers and discard them, we propose a progressive acquisition system that captures a dense specularity field as the only information for mesostructure reconstruction. Our method can efficiently recover surfaces with fine-scale geometric details from complex real-world objects with a wide variety of reflection properties, including translucent, low albedo, and highly specular objects. We show results for a variety of objects including human skin, dried apricot, orange, jelly candy, black leather and dark chocolate.
- Citation:
- Tongbo Chen, Michael Goesele and Hans-Peter Seidel. Mesostructure from Specularity, Proceedings of CVPR 2006, New York, NY, USA, June 2006.
- On-line documents:
- Complete article (PDF, 4.0MB)
Project Page
Piecewise Image Registration in the Presence of Multiple Large Motions
- Abstract:
- We present a technique for computing a dense pixel correspondence between two images of a scene containing multiple large, rigid motions. We model each motion with either a homography (for planar objects) or a fundamental matrix. The various motions in the scene are first extracted by clustering an initial sparse set of correspondences between feature points; we then perform a multi-label graph cut optimization which assigns each pixel to an independent motion and computes its disparity with respect to that motion. We demonstrate our technique on several example scenes and compare our results with previous approaches.
- Citation:
- Pravin Bhat, Ke Colin Zheng, Noah Snavely, Aseem Agarwala, Maneesh Agrawala, Michael F. Cohen and Brian Curless. Piecewise Image Registration in the Presence of Multiple Large Motions, Proceedings of CVPR 2006, New York, NY, USA, June 2006.
- On-line documents:
- Complete article (PDF, 0.8MB)
A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms
- Abstract:
- This paper presents a quantitative comparison of several multi-view stereo reconstruction algorithms. Until now, the lack of suitable calibrated multi-view image datasets with known ground truth (3D shape models) has prevented such direct comparisons. In this paper, we first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties. We then describe our process for acquiring and calibrating multi-view image datasets with high-accuracy ground truth and introduce our evaluation methodology. Finally, we present the results of our quantitative comparison of state-of-the-art multi-view stereo reconstruction algorithms on six benchmark datasets. The datasets, evaluation details, and instructions for submitting new models are available online at http://vision.middlebury.edu/mview.
- Citation:
- Steven M. Seitz, Brian Curless, James Diebel, Daniel Scharstein, and Richard Szeliski. A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, Proceedings of CVPR 2006, New York, NY, USA, June 2006.
- On-line documents:
- Complete article (PDF, 1.8MB)
Project Page
A Theory of Spherical Harmonic Identities for BRDF/Lighting Transfer and Image Consistency
- Abstract:
- We develop new mathematical results based on the spherical harmonic convolution framework for reflection from a curved surface. We derive novel identities, which are the angular frequency domain analogs to common spatial domain invariants such as reflectance ratios. They apply in a number of canonical cases, including single and multiple images of objects under the same and different lighting conditions. One important case we consider is two different glossy objects in two different lighting environments. Denote the spherical harmonic coefficients by $B^{\text{light},\text{material}}_{lm}$, where the subscripts refer to the spherical harmonic indices, and the superscripts to the lighting (1 or 2) and object or material (again 1 or 2). We derive a basic identity, $B^{1,1}_{lm} B^{2,2}_{lm} = B^{1,2}_{lm} B^{2,1}_{lm}$, independent of the specific lighting configurations or BRDFs. While this paper is primarily theoretical, it has the potential to lay the mathematical foundations for two important practical applications. First, we can develop more general algorithms for inverse rendering problems, which can directly relight and change material properties by transferring the BRDF or lighting from another object or illumination. Second, we can check the consistency of an image, to detect tampering or image splicing.
- Citation:
- Dhruv Mahajan, Ravi Ramamoorthi and Brian Curless. A Theory of Spherical Harmonic Identities for BRDF/Lighting Transfer and Image Consistency, in Proceedings of the Ninth European Conference on Computer Vision (ECCV 2006), Graz, Austria, May 2006.
- On-line documents:
- Complete article (PDF, 16.0MB)
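One way to see why the basic identity in the spherical harmonic abstract above can hold is sketched below. It assumes the standard convolution result for radially symmetric BRDFs, in which each reflected coefficient factors into a lighting coefficient and a material-dependent filter coefficient. The symbols $L^{i}_{lm}$ (lighting $i$) and $A^{j}_{l}$ (material $j$) are notation introduced here for illustration and do not come from the abstract.

```latex
% Assumed factorization: lighting i, material j
B^{i,j}_{lm} = A^{j}_{l} \, L^{i}_{lm}

% Both sides of the identity then reduce to the same product of four factors
B^{1,1}_{lm} B^{2,2}_{lm}
  = \bigl(A^{1}_{l} L^{1}_{lm}\bigr)\bigl(A^{2}_{l} L^{2}_{lm}\bigr)
  = \bigl(A^{2}_{l} L^{1}_{lm}\bigr)\bigl(A^{1}_{l} L^{2}_{lm}\bigr)
  = B^{1,2}_{lm} B^{2,1}_{lm}
```

Under this assumption the lighting and material terms cancel in the same way on both sides, so the identity does not depend on the particular lighting or BRDF values.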
Audio Analogies: Creating new music from an existing performance by concatenative synthesis
- Abstract:
- This paper describes a method for creating new music by concatenative synthesis. Given a MIDI score and an audio recording of an example piece of monophonic music, our method synthesizes audio to correspond with a new MIDI score. The algorithm we use is based on concatenative synthesis, commonly used for generating speech. Two versions of our algorithm are explored, one in which individual notes from the example piece are concatenated, and one in which pairs of adjacent notes from the example piece are concatenated. We examine the range of example pieces and target scores for which each version of our algorithm yields good results. Our underlying framework remains general enough to be applicable to other problems, such as rendering a stylized version of the target score, and other types of sound analogies.
- Citation:
- Audio Analogies: Creating new music from an existing performance by concatenative synthesis. Simon, I., Basu, S., Salesin, D. H. and Agrawala, M. Proceedings of ICMC 2005, Barcelona, Spain.
- On-line documents:
- Complete article (PDF, 0.4MB)
Dance reveals symmetry especially in young men
- Abstract:
- Dance is a common part of human courtship. Is it just for fun or does it carry a hidden message? This question was tackled in a population -- Jamaican -- where dance is particularly important. One property that dance might reflect is bodily symmetry, often used in evolutionary studies to measure developmental stability and genetic quality. A study using motion capture cameras to create video images of the dancers reveals a strong link between symmetry and dancing ability. The effect is stronger for men than for women, and women rate dances by symmetrical men relatively more positively than do men. It works both ways; symmetrical men value symmetry in women dancers more highly than less symmetrical men. In Jamaica at least, it seems that dance is a factor in sexual selection and reveals important information about the dancer. Freeze-frame images on the cover (by William M. Brown) show a symmetrical male dancer in action.
- Citation:
- William M. Brown, Lee Cronk, Keith Grochow, Amy Jacobson, C. Karen Liu, Zoran Popović, Robert Trivers. Dance reveals symmetry especially in young men. Nature 438(7071), 22 Dec 2005, pp. 1148-1150.
- On-line documents:
- Complete article (PDF, 0.2MB)
Project Page
A Theory of Inverse Light Transport
- Abstract:
- In this paper we consider the problem of computing and removing interreflections in photographs of real scenes. Towards this end, we introduce the problem of inverse light transport -- given a photograph of an unknown scene, decompose it into a sum of n-bounce images, where each image records the contribution of light that bounces exactly n times before reaching the camera. We prove the existence of a set of interreflection cancelation operators that enable computing each n-bounce image by multiplying the photograph by a matrix. This matrix is derived from a set of "impulse images" obtained by probing the scene with a narrow beam of light. The operators work under unknown and arbitrary illumination, and exist for scenes that have arbitrary spatially-varying BRDFs. We derive a closed-form expression for these operators in the Lambertian case and present experiments with textured and untextured Lambertian scenes that confirm our theory's predictions.
- Citation:
- Steven M. Seitz, Yasuyuki Matsushita and Kiriakos N. Kutulakos. A Theory of Inverse Light Transport, in Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV 2005), Beijing, China, October 2005.
- On-line documents:
- Complete article (PDF, 4.0MB)
Vignette and Exposure Calibration and Compensation
- Abstract:
- We discuss calibration and removal of "vignetting" (radial falloff) and exposure (gain) variations from sequences of images. Unique solutions for vignetting, exposure and scene radiances are possible when the response curve is known. When the response curve is unknown, an exponential ambiguity prevents us from recovering these parameters uniquely. However, the vignetting and exposure variations can nonetheless be removed from the images without resolving this ambiguity. Applications include panoramic image mosaics, photometry for material reconstruction, image-based rendering, and preprocessing for correlation-based vision algorithms.
- Citation:
- Dan B Goldman and Jiun-Hung Chen. Vignette and Exposure Calibration and Compensation, in Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV 2005), Beijing, China, October 2005.
- On-line documents:
- Complete article (PDF, 6.0MB)
Shape and Spatially-Varying BRDFs From Photometric Stereo
- Abstract:
- This paper describes a photometric stereo method designed for surfaces with spatially-varying BRDFs, including surfaces with both varying diffuse and specular properties. Our method builds on the observation that most objects are composed of a small number of fundamental materials. This approach recovers not only the shape but also material BRDFs and weight maps, yielding compelling results for a wide variety of objects. We also show examples of interactive lighting and editing operations made possible by our method.
- Citation:
- Dan B Goldman, Brian Curless, Aaron Hertzmann and Steven M. Seitz. Shape and Spatially-Varying BRDFs From Photometric Stereo, in Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV 2005), Beijing, China, October 2005.
- On-line documents:
- Complete article (PDF, 6.0MB)
Parameter Estimation for MRF Stereo
- Abstract:
- This paper presents a novel approach for estimating parameters for MRF-based stereo algorithms. This approach is based on a new formulation of stereo as a maximum a posteriori (MAP) problem, in which both a disparity map and MRF parameters are estimated from the stereo pair itself. We present an iterative algorithm for the MAP estimation that alternates between estimating the parameters while fixing the disparity map and estimating the disparity map while fixing the parameters. The estimated parameters include robust truncation thresholds, for both data and neighborhood terms, as well as a regularization weight. The regularization weight can be either a constant for the whole image, or spatially-varying, depending on local intensity gradients. In the latter case, the weights for intensity gradients are also estimated. Experiments indicate that our approach, as a wrapper for existing stereo algorithms, moves a baseline belief propagation stereo algorithm up six slots in the Middlebury rankings.
- Citation:
- Li Zhang and Steven M. Seitz. Parameter Estimation for MRF Stereo, in Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), San Diego CA, June 2005.
- On-line documents:
- Complete article (PDF, 1.0MB)
Project Web Page
Interactive Video Cutout
- Abstract:
- We present an interactive system for efficiently extracting foreground objects from a video. We extend previous min-cut based image segmentation techniques to the domain of video with four new contributions. We provide a novel painting-based user interface that allows users to easily indicate the foreground object across space and time. We introduce a hierarchical mean-shift preprocess in order to minimize the number of nodes that min-cut must operate on. Within the min-cut we also define new local cost functions to augment the global costs defined in earlier work. Finally, we extend 2D alpha matting methods designed for images to work with 3D video volumes. We demonstrate that our matting approach preserves smoothness across both space and time. Our interactive video cutout system allows users to quickly extract foreground objects from video sequences for use in a variety of applications including compositing onto new backgrounds and NPR cartoon style rendering.
- Citation:
- Jue Wang, Pravin Bhat, R. Alex Colburn, Maneesh Agrawala, Michael F. Cohen. ACM Transactions on Graphics 24(3), July 2005.
- On-line documents:
- Complete article (PDF, 60MB)
Animating Pictures with Stochastic Motion Textures
- Abstract:
- In this paper, we explore the problem of enhancing still pictures with subtly animated motions. We limit our domain to scenes containing passive elements that respond to natural forces in some fashion. We use a semi-automatic approach, in which a human user segments the scene into a series of layers to be individually animated. Then, a "stochastic motion texture" is automatically synthesized using a spectral method, i.e., the inverse Fourier transform of a filtered noise spectrum. The motion texture is a time-varying 2D displacement map, which is applied to each layer. The resulting warped layers are then recomposited to form the animated frames. The result is a looping video texture created from a single still image, which has the advantages of being more controllable and of generally higher image quality and resolution than a video texture created from a video source. We demonstrate the technique on a variety of photographs and paintings.
- Citation:
- Yung-Yu Chuang, Dan B Goldman, Ke Colin Zheng, Brian Curless, David H. Salesin, Richard Szeliski. ACM Transactions on Graphics 24(3), July 2005.
- On-line documents:
- Complete article (PDF, 1.3MB)
Project web page
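The spectral synthesis step mentioned in the stochastic motion textures abstract above (an inverse Fourier transform of a filtered noise spectrum) can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the low-pass filter shape, the `cutoff` parameter, and the function names are hypothetical, and the paper's motion texture is a time-varying 2D displacement map rather than the single scalar track produced here.

```python
import cmath
import math
import random

def hermitian_noise_spectrum(num_frames, cutoff, rng):
    """Random complex spectrum with a low-pass gain, mirrored so its inverse DFT is real."""
    spectrum = [0j] * num_frames
    for k in range(1, (num_frames - 1) // 2 + 1):
        noise = complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        gain = 1.0 / (1.0 + (k / cutoff) ** 2)  # hypothetical filter choice
        spectrum[k] = gain * noise
        spectrum[num_frames - k] = spectrum[k].conjugate()
    return spectrum

def inverse_dft(spectrum):
    """Naive O(n^2) inverse discrete Fourier transform."""
    n = len(spectrum)
    samples = []
    for t in range(n):
        total = 0j
        for k, coeff in enumerate(spectrum):
            total += coeff * cmath.exp(2j * math.pi * k * t / n)
        samples.append(total / n)
    return samples

def stochastic_displacement(num_frames=64, cutoff=4.0, seed=0):
    """Periodic displacement track: one value per frame, repeating after num_frames."""
    rng = random.Random(seed)
    spectrum = hermitian_noise_spectrum(num_frames, cutoff, rng)
    return [value.real for value in inverse_dft(spectrum)]

frames = stochastic_displacement(num_frames=64, cutoff=4.0, seed=1)
```

Because the signal is built from a discrete spectrum, it is periodic with period `num_frames`, which is what allows the resulting animation to loop without a visible seam.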
Learning Physics-based Motion Style with Nonlinear Inverse Optimization
- Abstract:
- This paper presents a novel physics-based representation of realistic character motion. The dynamical model incorporates several factors of locomotion derived from the biomechanical literature, including relative preferences for using some muscles more than others, elastic mechanisms at joints due to the mechanical properties of tendons, ligaments, and muscles, and variable stiffness at joints depending on the task. When used in a spacetime optimization framework, the parameters of this model define a wide range of styles of natural human movement.
Due to the complexity of biological motion, these style parameters are too difficult to design by hand. To address this, we introduce Nonlinear Inverse Optimization, a novel algorithm for estimating optimization parameters from motion capture data. Our method can extract the physical parameters from a single short motion sequence. Once captured, this representation of style is extremely flexible: motions can be generated in the same style but performing different tasks, and styles may be edited to change the physical properties of the body.
- Citation:
- C. Karen Liu, Aaron Hertzmann, Zoran Popović. ACM Transactions on Graphics 24(3), July 2005.
- On-line documents:
- Complete article (PDF, 954KB)
Project web page
Panoramic Video Textures
- Abstract:
- This paper describes a mostly automatic method for taking the output of a single panning video camera and creating a panoramic video texture (PVT): a video that has been stitched into a single, wide field of view and that appears to play continuously and indefinitely. The key problem in creating a PVT is that although only a portion of the scene has been imaged at any given time, the output must simultaneously portray motion throughout the scene. Like previous work in video textures, our method employs min-cut optimization to select fragments of video that can be stitched together both spatially and temporally. However, it differs from earlier work in that the optimization must take place over a much larger set of data. Thus, to create PVTs, we introduce a dynamic programming step, followed by a novel hierarchical min-cut optimization algorithm. We also use gradient-domain compositing to further smooth boundaries between video fragments. We demonstrate our results with an interactive viewer in which users can interactively pan and zoom on high-resolution PVTs.
- Citation:
- Aseem Agarwala, Ke Colin Zheng, Chris Pal, Maneesh Agrawala, Michael Cohen, Brian Curless, David H. Salesin, Richard Szeliski. ACM Transactions on Graphics 24(3), July 2005.
- On-line documents:
- Complete article (PDF, 954KB)
Project web page
Physically Based Rigging for Deformable Characters
- Abstract:
- In this paper we introduce a framework for instrumenting ("rigging") characters that are modeled as dynamic elastic bodies, so that their shapes can be controlled by an animator. Because the shape of such a character is determined by physical dynamics, the rigging system cannot simply dictate the shape as in traditional animation. For this reason, we introduce forces as the building blocks of rigging. Rigging forces guide the shape of the character, but are combined with other forces during simulation. Forces have other desirable features: they can be combined easily and simulated at any resolution, and since they are not tightly coupled with the surface geometry, they can be more easily transferred from one model to another. Our framework includes a new pose-dependent linearization scheme for elastic dynamics, which ensures a correspondence between forces and deformations, and at the same time produces plausible results at interactive speeds. We also introduce a novel method of handling collisions around creases.
- Citation:
- Steve Capell, Matthew Burkhart, Brian Curless, Tom Duchamp, and Zoran Popović. Proceedings of ACM SIGGRAPH / Eurographics Symposium on Computer Animation, 2005.
Extended version: Steve Capell, Matthew Burkhart, Brian Curless, Tom Duchamp, and Zoran Popović. Graphical Models, vol. 69, p. 71-87, 2007.
- On-line documents:
- Complete article (PDF, 4MB)
Project web page
If you would like an electronic copy of the extended version for non-commercial research and educational use only, please email Steve Capell (see the Grail people page).
Interactive, Image-Based Exploded View Diagrams
- Abstract:
- We present a system for creating interactive exploded view diagrams using 2D images as input. This image-based approach enables us to directly support arbitrary rendering styles, eliminates the need for building 3D models, and allows us to leverage the abundance of existing static diagrams of complex objects. We have developed a set of semi-automatic authoring tools for quickly creating layered diagrams that allow the user to specify how the parts of an object expand, collapse, and occlude one another. We also present a viewing system that lets users dynamically filter the information presented in the diagram by directly expanding and collapsing the exploded view and searching for individual parts. Our results demonstrate that a simple 2.5D diagram representation is powerful enough to enable a useful set of interactions and that, with the right authoring tools, effective interactive diagrams in this format can be created from existing static illustrations with a small amount of effort.
- Citation:
- Wilmot Li, Maneesh Agrawala, David H. Salesin. Interactive Image-Based Exploded View Diagrams, Graphics Interface 2004, May 2004.
- On-line documents:
Project Web Page
Example-Based Stereo with General BRDFs
- Abstract:
- This paper presents an algorithm for voxel-based reconstruction of objects with general reflectance properties from multiple calibrated views. It is assumed that one or more reference objects with known geometry are imaged under the same lighting and camera conditions as the object being reconstructed. The unknown object is reconstructed using a radiance basis inferred from the reference objects. Each view may have arbitrary, unknown distant lighting. If the lighting is calibrated, our model also takes into account shadows that the object casts upon itself. To our knowledge, this is the first stereo method to handle general, unknown, spatially-varying BRDFs under possibly varying, distant lighting, and shadows. We demonstrate our algorithm by recovering geometry and surface normals for objects with both uniform and spatially-varying BRDFs. The normals reveal fine-scale surface detail, allowing much richer renderings than the voxel geometry alone.
- Citation:
- Treuille, Adrien, Hertzmann, Aaron, Seitz, Steven M. Example-Based Stereo with General BRDFs, 8th European Conference on Computer Vision (ECCV 2004), Prague, Czech Republic, May 2004.
- On-line documents:
Video-Based Document Tracking: Unifying Your Physical and Electronic Desktops
- Abstract:
- This paper presents an approach for tracking paper documents on the desk over time and automatically linking them to the corresponding electronic documents using an overhead video camera. We demonstrate our system in the context of two scenarios, paper tracking and photo sorting. In the paper tracking scenario, the system tracks changes in the stacks of printed documents and books on the desk and builds a complete representation of the spatial structure of the desktop. When users want to find a printed document buried in the stacks, they can query the system based on appearance, keywords, or access time. The system also provides a remote desktop interface for directly browsing the physical desktop from a remote location. In the photo sorting scenario, users sort printed photographs into physical stacks on the desk. The system automatically recognizes the photographs and organizes the corresponding digital photographs into separate folders according to the physical arrangement. Our framework provides a way to unify the physical and electronic desktops without the need for a specialized physical infrastructure except for a video camera.
- Citation:
- Kim, Jiwon, Seitz, Steven M. and Agrawala, Maneesh. Video-Based Document Tracking: Unifying Your Physical and Electronic Desktops, UIST 2004, Santa Fe, New Mexico, USA, October 2004.
- On-line documents:
Project Page
Momentum-based Parameterization of Dynamic Character Motion
- Abstract:
- This paper presents a system for rapid editing of highly dynamic motion capture data. The heart of this system is an optimization algorithm that can transform the captured motion so that it satisfies high-level user constraints while enforcing that the linear and angular momentum of the motion remain physically plausible. Unlike most previous approaches to motion editing, our algorithm does not require pose specification or model reduction, and the user only need specify high-level changes to the input motion. To preserve the similar dynamic behavior of the input motion, we introduce a spline-based parameterization that matches the linear and angular momentum pattern of the motion capture data. Because our algorithm enables rapid convergence by presenting a good initial state of the optimization, the user can efficiently generate a large family of realistic motions from a single input motion. The algorithm can then populate the dynamic space of motions by simple interpolation, effectively parameterizing the space of realistic motions. We show how this framework can be used to produce an effective interface for rapid creation of dynamic animations, as well as to drive the dynamic motion of a character in real-time.
- Citation:
- Abe, Y., Liu, C. K., Popović, Z. Momentum-based Parameterization of Dynamic Character Motion, ACM SIGGRAPH / Eurographics Symposium on Computer Animation, August 2004.
- On-line documents:
Project page
Flow-based Video Synthesis and Editing
- Abstract:
- This paper presents a novel algorithm for synthesizing and editing video of natural phenomena that exhibit continuous flow patterns. The algorithm analyzes the motion of textured particles in the input video along user-specified flow lines, and synthesizes seamless video of arbitrary length by enforcing temporal continuity along a second set of user-specified flow lines. The algorithm is simple to implement and use. We used this technique to edit video of waterfalls, rivers, flames, and smoke.
- Citation:
- Bhat, Kiran S., Seitz, Steven M., Hodgins, Jessica K., Khosla, Pradeep K. Flow-based Video Synthesis and Editing, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (2.6MB)
Project page
Video Tooning
- Abstract:
- We describe a system for transforming an input video into a highly abstracted, spatio-temporally coherent cartoon animation with a range of styles. To achieve this, we treat video as a space-time volume of image data. We have developed an anisotropic kernel mean shift technique to segment the video data into contiguous volumes. These provide a simple cartoon style in themselves, but more importantly provide the capability to semi-automatically rotoscope semantically meaningful regions.
In our system, the user simply outlines objects on keyframes. A mean shift guided interpolation algorithm is then employed to create three dimensional semantic regions by interpolation between the keyframes, while maintaining smooth trajectories along the time dimension. These regions provide the basis for creating smooth two dimensional edge sheets and stroke sheets embedded within the spatio-temporal video volume. The regions, edge sheets, and stroke sheets are rendered by slicing them at particular times. A variety of styles of rendering are shown. The temporal coherence provided by the smoothed semantic regions and sheets results in a temporally consistent non-photorealistic appearance.
- Citation:
- Wang, Jue, Xu, Yingqing, Shum, Heung-Yeung, Cohen, Michael F. Video Tooning, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (4.0MB)
Spacetime Faces: High-Resolution Capture for Modeling and Animation
- Abstract:
- We present an end-to-end system that goes from video sequences to high resolution, editable, dynamically controllable face models. The capture system employs synchronized video cameras and structured light projectors to record videos of a moving face from multiple viewpoints. A novel spacetime stereo algorithm is introduced to compute depth maps accurately and overcome over-fitting deficiencies in prior work. A new template fitting and tracking procedure fills in missing data and yields point correspondence across the entire sequence without using markers. We demonstrate a data-driven, interactive method for inverse kinematics that draws on the large set of fitted templates and allows for posing new expressions by dragging surface points directly. Finally, we describe new tools that model the dynamics in the input sequence to enable new animations, created via key-framing or texture-synthesis techniques.
- Citation:
- Zhang, Li, Snavely, Noah, Curless, Brian, Seitz, Steven M. Spacetime Faces: High-Resolution Capture for Modeling and Animation. ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (10.3MB)
Project page
Fluid Control using the Adjoint Method
- Abstract:
- We describe a novel method for controlling physics-based fluid simulations through gradient-based nonlinear optimization. Using a technique known as the adjoint method, derivatives can be computed efficiently, even for large 3D simulations with millions of control parameters. In addition, we introduce the first method for the full control of free-surface liquids. We show how to compute adjoint derivatives through each step of the simulation, including the fast marching algorithm, and describe a new set of control parameters specifically designed for liquids.
- Citation:
- McNamara, Antoine, Treuille, Adrien, Popović, Zoran, Stam, Jos. Fluid Control using the Adjoint Method, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (4.0MB)
Project page
Interactive Digital Photomontage
- Abstract:
- We describe an interactive, computer-assisted framework for combining parts of a set of photographs into a single composite picture, a process we call "digital photomontage." Our framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite. Also central to the framework is a suite of interactive tools that allow the user to specify a variety of high-level image objectives, either globally across the image, or locally through a painting-style interface. Image objectives are applied independently at each pixel location and generally involve a function of the pixel values (such as "maximum contrast") drawn from that same location in the set of source images. Typically, a user applies a series of image objectives iteratively in order to create a finished composite. The power of this framework lies in its generality; we show how it can be used for a wide variety of applications, including "selective composites" (for instance, group photos in which everyone looks their best), relighting, extended depth of field, panoramic stitching, clean-plate production, stroboscopic visualization of movement, and time-lapse mosaics.
- Citation:
- Agarwala, Aseem, Dontcheva, Mira, Agrawala, Maneesh, Drucker, Steven, Colburn, Alex, Curless, Brian, Salesin, David H., Cohen, Michael. Interactive Digital Photomontage, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (6.0MB)
Project page
Keyframe-Based Tracking for Rotoscoping and Animation
- Abstract:
- We describe a new approach to rotoscoping --- the process of tracking contours in a video sequence --- that combines computer vision with user interaction. In order to track contours in video, the user specifies curves in two or more frames; these curves are used as keyframes by a computer-vision-based tracking algorithm. The user may interactively refine the curves and then restart the tracking algorithm. Combining computer vision with user interaction allows our system to track any sequence with significantly less effort than interpolation-based systems --- and with better reliability than pure computer vision systems. Our tracking algorithm is cast as a spacetime optimization problem that solves for time-varying curve shapes based on an input video sequence and user-specified constraints. We demonstrate our system with several rotoscoped examples. Additionally, we show how these rotoscoped contours can be used to help create cartoon animation by attaching user-drawn strokes to the tracked contours.
- Citation:
- Agarwala, Aseem, Hertzmann, Aaron, Salesin, David H., Seitz, Steven. Keyframe-Based Tracking for Rotoscoping and Animation, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (2.4MB)
Project page
Style-based Inverse Kinematics
- Abstract:
- We present an inverse kinematics system based on a learned model of human poses. Given a set of constraints, our system can produce the most likely pose satisfying those constraints, in realtime. Training the model on different input data leads to different styles of IK. The model is represented as a probability distribution over the space of all possible poses. This means that our IK system can generate any pose, but prefers poses that are most similar to the space of poses in the training data. We represent the probability with a novel model called a Scaled Gaussian Process Latent Variable Model. The parameters of the model are all learned automatically; no manual tuning is required for the learning component of the system. We additionally describe a novel procedure for interpolating between styles.
Our style-based IK can replace conventional IK, wherever it is used in computer animation and computer vision. We demonstrate our system in the context of a number of applications: interactive character posing, trajectory keyframing, real-time motion capture with missing markers, and posing from a 2D image.
- Citation:
- Grochow, Keith, Martin, Steven L., Hertzmann, Aaron, and Popović, Zoran. Style-based Inverse Kinematics, ACM Transactions on Graphics 23(3), July 2004.
- On-line documents:
- PDF (1.4MB)
Project page
On Creating Animated Presentations
- Abstract:
- Computers are used to display visuals for millions of live presentations each day, and yet only the tiniest fraction of these make any real use of the powerful graphics hardware available on virtually all of today's machines. In this paper, we describe our efforts toward harnessing this power to create better types of presentations: presentations that include meaningful animation as well as at least a limited degree of interactivity. Our approach has been iterative, alternating between creating animated talks using available tools, then improving the tools to better support the kinds of talk we wanted to make. Through this cyclic design process, we have identified a set of common authoring paradigms that we believe a system for building animated presentations should support. We describe these paradigms and present the latest version of our script-based system for creating animated presentations, called SLITHY. We show several examples of actual animated talks that were created and given with versions of SLITHY, including one talk presented at SIGGRAPH 2000 and four talks presented at SIGGRAPH 2002. Finally, we describe a set of design principles that we have found useful for making good use of animation in presentation.
- Citation:
- Zongker, Douglas E. and Salesin, David H. On Creating Animated Presentations, Eurographics / ACM SIGGRAPH Symposium on Computer Animation, July 2003.
- On-line documents:
- PDF (1.0MB)
Adaptive Grid-Based Document Layout
- Abstract:
- Grid-based page designs are ubiquitous in commercially printed publications, such as newspapers and magazines. Yet, to date, no one has invented a good way to easily and automatically adapt such designs to arbitrarily-sized electronic displays. The difficulty of generalizing grid-based designs explains the generally inferior nature of on-screen layouts when compared to their printed counterparts, and is arguably one of the greatest remaining impediments to creating on-line reading experiences that rival those of ink on paper. In this work, we present a new approach to adaptive grid-based document layout, which attempts to bridge this gap. In our approach, an adaptive layout style is encoded as a set of grid-based templates that know how to adapt to a range of page sizes and other viewing conditions. These templates include various types of layout elements (such as text, figures, etc.) and define, through constraint-based relationships, just how these elements are to be laid out together as a function of both the properties of the content itself, such as a figure's size and aspect ratio, and the properties of the viewing conditions under which the content is being displayed. We describe an XML-based representation for our templates and content, which maintains a clean separation between the two. We also describe the various parts of our research prototype system: a layout engine for formatting the page; a paginator for determining a globally optimal allocation of content amongst the pages; and a graphical user interface for interactively creating adaptive templates. We also provide numerous examples demonstrating the capabilities of this prototype, including this paper, itself, which has been laid out with our system.
- Citation:
- Jacobs, C., Li, W., Schrier, E., Bargeron, D., and Salesin, D. Adaptive Grid-Based Document Layout, ACM Transactions on Graphics 22(3) (Proceedings of ACM SIGGRAPH 2003), July 2003, pp. 838-847.
- On-line documents:
- PDF (8.6MB)
Shape and Motion under Varying Illumination: Unifying Structure from Motion, Photometric Stereo, and Multi-view Stereo
- Abstract:
- This paper presents an algorithm for computing optical flow, shape, motion, lighting, and albedo from an image sequence of a rigidly-moving Lambertian object under distant illumination. The problem is formulated in a manner that subsumes structure from motion, multi-view stereo, and photometric stereo as special cases. The algorithm utilizes both spatial and temporal intensity variation as cues: the former constrains flow and the latter constrains surface orientation; combining both cues enables dense reconstruction of both textured and texture-less surfaces. The algorithm works by iteratively estimating affine camera parameters, illumination, shape, and albedo in an alternating fashion. Results are demonstrated on videos of hand-held objects moving in front of a fixed light and camera.
- Citation:
- Zhang, L., Curless, B., Hertzmann, A. and Seitz, Steven M. Shape and Motion under Varying Illumination: Unifying Structure from Motion, Photometric Stereo, and Multi-view Stereo, Proceedings of the 9th IEEE International Conference on Computer Vision (ICCV), Nice, France, October 2003.
- On-line documents:
Project Web Page
A Sketching Interface for Articulated Animation
- Abstract:
- We introduce a new interface for rapidly creating 3D articulated figure animation, from 2D sketches of the character in the desired key frame poses. Since the exact 3D pose corresponding to a 2D drawing is ambiguous, we first reconstruct a set of possible 3D configurations and then apply a set of constraints and assumptions to present the user with the most likely 3D pose. The user can refine this candidate pose by choosing among alternate poses proposed by the system. This interface is supported by pose reconstruction and optimization methods specifically designed to work with imprecise hand drawn figures. Our system provides a simple, intuitive and fast interface for creating rough animations that leverages our users' existing ability to draw. The resulting key framed sequence can be exported to commercial animation packages for interpolation and additional refinement.
- Citation:
- Davis, J., Agrawala, M., Chuang, E., Popović, Z. and Salesin, David H. 2003. A Sketching Interface for Articulated Animation, Eurographics / ACM SIGGRAPH Symposium on Computer Animation.
- On-line documents:
- PDF (3.5MB)
Project Web Page
Estimating Cloth Simulation Parameters from Video
- Abstract:
- Cloth simulations are notoriously difficult to tune due to the many parameters that must be adjusted to achieve the look of a particular fabric. In this paper, we present an algorithm for estimating the parameters of a cloth simulation from video data of real fabric. A perceptually motivated metric based on matching between folds is used to compare video of real cloth with simulation. This metric compares two video sequences of cloth and returns a number that measures the differences in their folds. Simulated annealing is used to minimize the frame by frame error between the metric for a given simulation and the real-world footage. To estimate all the cloth parameters, we identify simple static and dynamic calibration experiments that use small swatches of the fabric. To demonstrate the power of this approach, we use our algorithm to find the parameters for four different fabrics. We show the match between the video footage and simulated motion on the calibration experiments, on new video sequences for the swatches, and on a simulation of a full skirt.
- Citation:
- Bhat, K. S., Twigg, C. D., Hodgins, J. K., Khosla, P. K., Popović, Z. and Seitz, Steven M. 2003. Estimating Cloth Simulation Parameters from Video, Eurographics / ACM SIGGRAPH Symposium on Computer Animation.
- On-line documents:
- PDF (7.5MB)
Project Web Page
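The parameter search described in the cloth-estimation abstract above relies on simulated annealing. As a rough illustration of that general technique only, and not the authors' implementation, the sketch below anneals a parameter vector against a toy error function. The names `simulated_annealing` and `toy_error`, the Gaussian proposal, and the geometric cooling schedule are all assumptions made for this example; in the paper's setting the error would come from comparing folds in simulated and real video.

```python
import math
import random


def simulated_annealing(error, initial, step=0.5, t_start=1.0, t_end=1e-3,
                        iters=5000, seed=0):
    """Minimize error(params) by simulated annealing (illustrative sketch)."""
    rng = random.Random(seed)
    current = list(initial)
    current_err = error(current)
    best, best_err = list(current), current_err
    for i in range(iters):
        # Assumed geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (iters - 1))
        # Assumed proposal: perturb every parameter with Gaussian noise.
        candidate = [x + rng.gauss(0.0, step) for x in current]
        cand_err = error(candidate)
        # Always accept improvements; accept worse candidates with
        # Boltzmann probability exp(-(increase in error) / temperature).
        if cand_err < current_err or rng.random() < math.exp((current_err - cand_err) / t):
            current, current_err = candidate, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
    return best, best_err


def toy_error(params):
    """Hypothetical stand-in for a video-vs-simulation error metric."""
    target = [2.0, -1.0]
    return sum((p - q) ** 2 for p, q in zip(params, target))


best_params, best_error = simulated_annealing(toy_error, [0.0, 0.0])
```

Because the best candidate seen so far is tracked separately from the current state, the returned error is never worse than that of the initial guess, even though the search itself sometimes moves uphill.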
Layered Acting for Character Animation
- Abstract:
- We introduce an acting-based animation system for creating and editing character animation at interactive speeds. Our system requires minimal training, typically under an hour, and is well suited for rapidly prototyping and creating expressive motion. A real-time motion-capture framework records the user's motions for simultaneous analysis and playback on a large screen. The animator's real-world, expressive motions are mapped into the character's virtual world. Visual feedback maintains a tight coupling between the animator and character. Complex motion is created by layering multiple passes of acting. We also introduce a novel motion-editing technique, which derives implicit relationships between the animator and character. The animator mimics some aspect of the character motion, and the system infers the association between features of the animator's motion and those of the character. The animator modifies the mimic by acting again, and the system maps the changes onto the character. We demonstrate our system with several examples.
- Citation:
- Dontcheva, M., Yngve, G. and Popović, Z. Layered Acting for Character Animation. ACM Transactions on Graphics (ACM SIGGRAPH 2003).
- On-line documents:
Project Web Page
Bird Flight
- Abstract:
- In this paper we describe a physics-based method for synthesis of bird flight animations. Our method computes a realistic set of wingbeats that enables a bird to follow the specified trajectory. We model the bird as an articulated skeleton with elastically deformable feathers. The bird motion is created by applying joint torques and aerodynamic forces over time in a forward dynamics simulation. We solve for each wingbeat motion separately by optimizing for wingbeat parameters that create the most natural motion. The final animation is constructed by concatenating a series of optimal wingbeats. This detailed bird flight model enables us to produce flight motions of different birds performing a variety of maneuvers including taking off, cruising, rapidly descending, turning, and landing.
- Citation:
- Wu, Jia-Chi and Zoran Popović. 2003. Realistic Modeling of Bird Flight Animations, ACM Transactions on Graphics (ACM SIGGRAPH 2003).
- On-line documents:
- PDF (3.6MB)
Project Web Page
Keyframe Control of Smoke Simulations
- Abstract:
- We describe a method for controlling smoke simulations through user-specified keyframes. To achieve the desired behavior, a continuous quasi-Newton optimization solves for appropriate "wind" forces to be applied to the underlying velocity field throughout the simulation. The cornerstone of our approach is a method to efficiently compute exact derivatives through the steps of a fluid simulation. We formulate an objective function corresponding to how well a simulation matches the user's keyframes, and use the derivatives to solve for force parameters that minimize this function. For animations with several keyframes, we present a novel multiple-shooting approach. By splitting large problems into smaller overlapping subproblems, we greatly speed up the optimization process while avoiding certain local minima.
- Citation:
- Treuille, A., McNamara, A., Popović, Z. and Stam, J. 2003. Keyframe Control of Smoke Simulations, ACM Transactions on Graphics (ACM SIGGRAPH 2003).
- On-line documents:
- PDF (0.9 MB)
Project web page
Shadow Matting and Compositing
- Abstract:
- In this paper, we describe a method for extracting shadows from one natural scene and inserting them into another. We develop physically-based shadow matting and compositing equations and use these to pull a shadow matte from a source scene in which the shadow is cast onto an arbitrary planar background. We then acquire the photometric and geometric properties of the target scene by sweeping oriented linear shadows (cast by a straight object) across it. From these shadow scans, we can construct a shadow displacement map without requiring camera or light source calibration. This map can then be used to deform the original shadow matte. We demonstrate our approach for both indoor scenes with controlled lighting and for outdoor scenes using natural lighting.
- Citation:
- CHUANG, Y.-Y., GOLDMAN, D. B., CURLESS, B., SALESIN, D. H., and SZELISKI, R. 2003. Shadow Matting and Compositing, ACM Transactions on Graphics (ACM SIGGRAPH 2003).
- On-line documents:
- PDF (1.8 MB)
- Project web page
The space of human body shapes: reconstruction and parameterization from range scans
- Abstract:
- We develop a novel method for fitting high-resolution template meshes to detailed human body range scans with sparse 3D markers. We formulate an optimization problem in which the degrees of freedom are an affine transformation at each template vertex. The objective function is a weighted combination of three measures: proximity of transformed vertices to the range data, similarity between neighboring transformations, and proximity of sparse markers at corresponding locations on the template and target surface. We solve for the transformations with a non-linear optimizer, run at two resolutions to speed convergence. We demonstrate reconstruction and consistent parameterization of 250 human body models. With this parameterized set, we explore a variety of applications for human body modeling, including: morphing, texture transfer, statistical analysis of shape, model fitting from sparse markers, feature analysis to modify multiple correlated parameters (such as the weight and height of an individual), and transfer of surface detail and animation controls from a template to fitted models.
- Citation:
- ALLEN, B., CURLESS, B., and POPOVIĆ, Z. 2003. The space of human body shapes: reconstruction and parameterization from range scans, ACM Transactions on Graphics (ACM SIGGRAPH 2003).
- On-line documents:
- Paper web page
- PDF (6.3 MB)
- Project web page
Shape and Materials by Example: A Photometric Stereo Approach
- Abstract:
- This paper presents a technique for computing the geometry of objects with general reflectance properties from images. For surfaces with varying material properties, a full segmentation into different material types is also computed. It is assumed that the camera viewpoint is fixed, but the illumination varies over the input sequence. It is also assumed that one or more example objects with similar materials and known geometry are imaged under the same illumination conditions. Unlike most previous work in shape reconstruction, this technique can handle objects with arbitrary and spatially-varying BRDFs. Furthermore, the approach works for arbitrary distant and unknown lighting environments. Finally, almost no calibration is needed, making the approach exceptionally simple to apply.
- Citation:
- Aaron Hertzmann, Steven M. Seitz, Proceedings of CVPR 2003.
- On-line documents:
- Project web page
Spacetime Stereo: Shape Recovery for Dynamic Scenes
- Abstract:
- This paper extends the traditional binocular stereo problem into the spacetime domain, in which a pair of video streams is matched simultaneously instead of matching pairs of images frame by frame. Almost any existing stereo algorithm may be extended in this manner simply by replacing the image matching term with a spacetime term. By utilizing both spatial and temporal appearance variation, this modification reduces ambiguity and increases accuracy. Three major applications for spacetime stereo are proposed in this paper. First, spacetime stereo serves as a general framework for structured light scanning and generates high quality depth maps for static scenes. Second, spacetime stereo is effective for a class of natural scenes, such as waving trees and flowing water, which have repetitive textures and chaotic behaviors and are challenging for existing stereo algorithms. Third, the approach is one of very few existing methods that can robustly reconstruct objects that are moving and deforming over time, achieved by use of oriented spacetime windows in the matching procedure. Promising experimental results in the above three scenarios are demonstrated.
- Citation:
- Li Zhang, Brian Curless, and Steven M. Seitz, Proceedings of CVPR 2003.
- On-line documents:
- Complete article [Acrobat pdf file]
Project Web Page
View-dependent refinement of multiresolution meshes with subdivision connectivity
- Abstract:
- We present a view-dependent level-of-detail algorithm for triangle meshes with subdivision connectivity. The algorithm is more suitable for textured meshes of arbitrary topology than existing progressive mesh-based schemes. It begins with a wavelet decomposition of the mesh, and, per frame, finds a partial sum of wavelets necessary for high-quality renderings from that frame's viewpoint. We present a screen-space error metric that measures both geometric and texture deviation and tends to outperform prior error metrics developed for progressive meshes. In addition, wavelets that lie outside the view frustum or in backfacing areas are eliminated. The algorithm takes advantage of frame-to-frame coherence for improved performance and supports geomorphs for smooth transitions between levels of detail.
- Citation:
- Daniel I. Azuma, Daniel N. Wood, Brian Curless, Tom Duchamp, David H. Salesin, and Werner Stuetzle, Proceedings of AFRIGRAPH 2003.
- On-line documents:
- Complete article [Acrobat pdf file]
Single View Modeling of Free-Form Scenes
- Abstract:
- This paper presents a novel approach for reconstructing free-form, texture-mapped, 3D scene models from a single painting or photograph. Given a sparse set of user-specified constraints on the local shape of the scene, a smooth 3D surface that satisfies the constraints is generated. This problem is formulated as a constrained variational optimization problem. In contrast to previous work in single view reconstruction, our technique enables high quality reconstructions of free-form curved surfaces with arbitrary reflectance properties. A key feature of the approach is a novel hierarchical transformation technique for accelerating convergence on a non-uniform, piecewise continuous grid. The technique is interactive and updates the model in real time as constraints are added, allowing fast reconstruction of photorealistic scene models. The approach is shown to yield high quality results on a large variety of images.
- Citation:
- L. Zhang, G. Dugas-Phocion, J.-S. Samson, and S. M. Seitz, Proceedings of CVPR 2001.
- L. Zhang, G. Dugas-Phocion, J.-S. Samson, and S. M. Seitz, Journal of Visualization and Computer Animation, 2002, (Invited paper).
- On-line documents:
- Project web page
Curve Analogies
- Abstract:
- This paper describes a method for learning statistical models of 2D curves, and shows how these models can be used to design line art rendering styles by example. A user can create a new style by providing an example of the style, e.g. by sketching a curve in a drawing program. Our method can then synthesize random new curves in this style, and modify existing curves to have the same style as the example. This method can incorporate position constraints on the resulting curves.
- Citation:
- Aaron Hertzmann, Nuria Oliver, Brian Curless, and Steven M. Seitz. 13th Eurographics Workshop on Rendering, Pisa, Italy, June 26-28, 2002.
- On-line documents:
- Complete article [Acrobat pdf file]
Rapid Shape Acquisition Using Color Structured Light and Multi-pass Dynamic Programming
- Abstract:
- This paper presents a color structured light technique for recovering object shape from one or more images. The technique works by projecting a pattern of stripes of alternating colors and matching the projected color transitions with observed edges in the image. The correspondence problem is solved using a novel, multi-pass dynamic programming algorithm that eliminates global smoothness assumptions and strict ordering constraints present in previous formulations. The resulting approach is suitable for generating both high-speed scans of moving objects when projecting a single stripe pattern and high-resolution scans of static scenes using a short sequence of time-shifted stripe patterns. In the latter case, spacetime analysis is used at each sensor pixel to obtain inter-frame depth localization. Results are demonstrated for a variety of complex scenes.
- Citation:
- Li Zhang, Brian Curless, and Steven M. Seitz. 1st international symposium on 3D data processing, visualization, and transmission, Padova, Italy, June 19-21, 2002.
- On-line documents:
- Complete article [Acrobat pdf file]
Project page
Articulated Body Deformation from Range Scan Data
- Abstract:
- This paper presents an example-based method for calculating skeleton-driven body deformations. Our example data consists of range scans of a human body in a variety of poses. Using markers captured during range scanning, we construct a kinematic skeleton and identify the pose of each scan. We then construct a mutually consistent parameterization of all the scans using a posable subdivision surface template. The detail deformations are represented as displacements from this surface, and holes are filled smoothly within the displacement maps. Finally, we combine the range scans using k-nearest neighbor interpolation in pose space. We demonstrate results for a human upper body with controllable pose, kinematics, and underlying surface shape.
- Citation:
- Brett Allen, Brian Curless, and Zoran Popović. Proceedings of SIGGRAPH 2002, in Computer Graphics Proceedings, Annual Conference Series, 2002.
- On-line documents:
- Complete article [Acrobat pdf file]
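The abstract above combines range scans using k-nearest neighbor interpolation in pose space. A minimal sketch of that general idea follows, assuming inverse-distance weights and a plain Euclidean distance between pose vectors; neither choice is stated in the abstract. The function name `knn_interpolate` and the one-dimensional example data are hypothetical.

```python
import math


def knn_interpolate(query_pose, examples, k=3, eps=1e-9):
    """Blend the values attached to the k example poses nearest to query_pose.

    examples is a list of (pose, value) pairs, where pose and value are
    lists of floats (e.g. joint angles and surface displacements).
    """
    # Rank the examples by Euclidean distance in pose space and keep k.
    ranked = sorted(examples, key=lambda ex: math.dist(query_pose, ex[0]))[:k]
    # Assumed weighting: inverse distance, with eps guarding against
    # division by zero when the query coincides with an example pose.
    weights = [1.0 / (math.dist(query_pose, pose) + eps) for pose, _ in ranked]
    total = sum(weights)
    dim = len(ranked[0][1])
    # Normalized weighted average of the neighbors' values, per component.
    return [
        sum(w * value[i] for w, (_, value) in zip(weights, ranked)) / total
        for i in range(dim)
    ]


# Hypothetical data: a single joint angle mapped to a single displacement.
scans = [([0.0], [0.0]), ([1.0], [10.0]), ([2.0], [20.0])]
midway = knn_interpolate([0.5], scans, k=2)
at_example = knn_interpolate([1.0], scans, k=2)
```

With these weights the result reproduces an example almost exactly when the query pose coincides with it, and blends smoothly between neighbors elsewhere.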
Interactive Skeleton-Driven Dynamic Deformations
- Abstract:
- This paper presents a framework for the skeleton-driven animation of elastically deformable characters. A character is embedded in a coarse volumetric control lattice, which provides the structure needed to apply the finite element method. To incorporate skeletal controls, we introduce line constraints along the bones of simple skeletons. The bones are made to coincide with edges of the control lattice, which enables us to apply the constraints efficiently using algebraic methods. To accelerate computation, we associate regions of the volumetric mesh with particular bones and perform locally linearized simulations, which are blended at each time step. We define a hierarchical basis on the control lattice, so for detailed interactions the simulation can adapt the level of detail. We demonstrate the ability to animate complex models using simple skeletons and coarse volumetric meshes in a manner that simulates secondary motions at interactive rates.
- Citation:
- Steve Capell, Seth Green, Brian Curless, Tom Duchamp, and Zoran Popović. Proceedings of SIGGRAPH 2002, in Computer Graphics Proceedings, Annual Conference Series, 2002.
- On-line documents:
- Complete article (PDF, 1.7MB)
- Project web page
A Multiresolution Framework for Dynamic Deformations
- Abstract:
- We present a novel framework for the dynamic simulation of elastic deformable solids. Our approach combines classical finite element methodology with a multiresolution subdivision framework in order to produce fast, easy to use, and realistic animations. We represent deformations using a hierarchical basis constructed using volumetric subdivision. The subdivision framework provides topological flexibility and the hierarchical basis allows the simulation to add detail where it is needed. Since volumetric parameterization is difficult for complex models, we support the embedding of objects in domains that are easier to parameterize.
- Citation:
- Steve Capell, Seth Green, Brian Curless, Tom Duchamp, and Zoran Popović. Proceedings of ACM SIGGRAPH Symposium on Computer Animation, 2002.
- On-line documents:
- Complete article (PDF, 0.7MB)
- Project web page
Video Matting of Complex Scenes
- Abstract:
- This paper describes a new framework for video matting, the process of pulling a high-quality alpha matte and foreground from a video sequence. The framework builds upon techniques in natural image matting, optical flow computation, and background estimation. User interaction is comprised of garbage matte specification if background estimation is needed, and hand-drawn keyframe segmentations into "foreground," "background," and "unknown". The segmentations, called trimaps, are interpolated across the video volume using forward and backward optical flow. Competing flow estimates are combined based on information about where flow is likely to be accurate. A Bayesian matting technique uses the flowed trimaps to yield high-quality mattes of moving foreground elements with complex boundaries filmed by a moving camera. A novel technique for smoke matte extraction is also demonstrated.
- Citation:
- Yung-Yu Chuang, Aseem Agarwala, Brian Curless, David H. Salesin, and Richard Szeliski. Proceedings of SIGGRAPH 2002, in Computer Graphics Proceedings, Annual Conference Series, 2002.
- On-line documents:
- Complete article [Acrobat pdf file]
- Project web page
Synthesis of Complex Dynamic Character Motion from Simple Animations
- Abstract:
- In this paper we present a general method for rapid prototyping of realistic character motion. We solve for the natural motion from a simple animation provided by the animator. Our framework can be used to produce relatively complex realistic motion with little user effort. We describe a novel constraint detection method that automatically determines different constraints on the character by analyzing the input motion. We show that realistic motion can be achieved by enforcing a small set of linear and angular momentum constraints. This simplified approach helps us avoid the complexities of computing muscle forces. Simpler dynamic constraints also allow us to generate animations of models with greater complexity, performing more intricate motions. Finally, we show that by learning a small set of key parameters that describe a character pose we can help a non-skilled animator rapidly create realistic character motion.
- Citation:
- C. Karen Liu and Zoran Popović. Proceedings of SIGGRAPH 2002, in Computer Graphics Proceedings, Annual Conference Series, 2002.
- On-line documents:
- Complete article [Acrobat pdf file]
The Space of All Stereo Images
- Abstract:
- A theory of stereo image formation is presented that enables a complete classification of all possible stereo views, including non-perspective varieties. Towards this end, the notion of epipolar geometry is generalized to apply to multiperspective images. It is shown that any stereo pair must consist of rays lying on one of three varieties of quadric surfaces. A unified representation is developed to model all classes of stereo views, based on the concept of a quadric view. The benefits include a unified treatment of projection and triangulation operations for all stereo views. The framework is applied to derive new types of stereo image representations with unusual and useful properties. Experimental examples of these images are constructed and used to obtain 3D binocular object reconstructions.
- Citation:
- Steven M. Seitz and Jiwon Kim. Marr Prize Special Issue, IJCV 2001. First published in ICCV 2001.
- On-line documents:
- Project web page
Image Analogies
- Abstract:
- This paper describes a new framework for processing images by example, called "image analogies." The framework involves two stages: a design phase, in which a pair of images, with one image purported to be a "filtered" version of the other, is presented as training data; and an application phase, in which the learned filter is applied to some new target image in order to create an "analogous" filtered result. Image analogies are based on a simple multi-scale autoregression, inspired primarily by recent results in texture synthesis. By choosing different types of source image pairs as input, the framework supports a wide variety of "image filter" effects, including traditional image filters, such as blurring or embossing; improved texture synthesis, in which some textures are synthesized with higher quality than by previous approaches; super-resolution, in which a higher-resolution image is inferred from a low-resolution source; texture transfer, in which images are "texturized" with some arbitrary source texture; artistic filters, in which various drawing and painting styles are synthesized based on scanned real-world examples; and texture-by-numbers, in which realistic scenes, composed of a variety of textures, are created using a simple painting interface.
- Citation:
- Aaron Hertzmann, Charles E. Jacobs, Nuria Oliver, Brian Curless, and David H. Salesin. Proceedings of SIGGRAPH 2001, in Computer Graphics Proceedings, Annual Conference Series, 2001.
- On-line documents:
- Project web page
A Bayesian Approach to Digital Matting
- Abstract:
- This paper proposes a new Bayesian framework for solving the matting problem, i.e. extracting a foreground element from a background image by estimating an opacity for each pixel of the foreground element. Our approach models both the foreground and background color distributions with spatially-varying sets of Gaussians, and assumes a fractional blending of the foreground and background colors to produce the final output. It then uses a maximum-likelihood criterion to estimate the optimal opacity, foreground and background simultaneously. In addition to providing a principled approach to the matting problem, our algorithm effectively handles objects with intricate boundaries, such as hair strands and fur, and provides an improvement over existing techniques for these difficult cases.
- Citation:
- Yung-Yu Chuang, Brian Curless, David H. Salesin, and Richard Szeliski. Proceedings of CVPR 2001.
- On-line documents:
- Complete article [Acrobat pdf file]
- Project web page
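- Illustrative sketch:
- The "fractional blending" assumed in the abstract above is the compositing relation C = alpha*F + (1 - alpha)*B. The sketch below shows only one small piece of the problem: if the foreground color F and background color B of a pixel are taken as already known, the alpha that best explains the observed color C in the least-squares sense is the projection of (C - B) onto (F - B). The function names are hypothetical, and the Gaussian color models and simultaneous estimation described in the abstract are deliberately left out.

```python
def composite(alpha, fg, bg):
    """Blend foreground and background colors: C = alpha*F + (1 - alpha)*B."""
    return [alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg)]

def estimate_alpha(observed, fg, bg):
    """Least-squares alpha for fixed F and B (assumed known here)."""
    diff = [f - b for f, b in zip(fg, bg)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # F equals B, so alpha cannot be determined; return 0 by convention.
        return 0.0
    num = sum((c - b) * d for c, b, d in zip(observed, bg, diff))
    # Clamp to the valid opacity range.
    return min(1.0, max(0.0, num / denom))

red, blue = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]
pixel = composite(0.3, red, blue)
recovered = estimate_alpha(pixel, red, blue)
```

- In this noise-free example the recovered opacity matches the 0.3 used to build the pixel, up to floating-point rounding.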
Interactive Control of Rigid Body Simulations
- Abstract:
- Physical simulation of dynamic objects has become commonplace in computer graphics because it produces highly realistic animations. In this paradigm the animator provides few physical parameters such as the objects' initial positions and velocities, and the simulator automatically generates realistic motions. The resulting motion, however, is difficult to control because even a small adjustment of the input parameters can drastically affect the subsequent motion. Furthermore, the animator often wishes to change the end-result of the motion instead of the initial physical parameters. We describe a novel interactive technique for intuitive manipulation of rigid multi-body simulations. Using our system, the animator can select bodies at any time and simply drag them to desired locations. In response, the system computes the required physical parameters and simulates the resulting motion. Surface characteristics such as normals and elasticity coefficients can also be automatically adjusted to provide a greater range of feasible motions, if the animator so desires. Because the entire simulation editing process runs at interactive speeds, the animator can rapidly design complex physical animations that would be difficult to achieve with existing rigid body simulators.
- Citation:
- Jovan Popović, Steven M. Seitz, Michael Erdmann, Zoran Popović, and Andrew Witkin. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file]
The Digital Michelangelo Project: 3D Scanning of Large Statues
- Abstract:
- We describe a hardware and software system for digitizing the shape and color of large fragile objects under non-laboratory conditions. Our system employs laser triangulation rangefinders, laser time-of-flight rangefinders, digital still cameras, and a suite of software for acquiring, aligning, merging, and viewing scanned data. As a demonstration of this system, we digitized 10 statues by Michelangelo, including the well-known figure of David, two building interiors, and all 1,163 extant fragments of the Forma Urbis Romae, a giant marble map of ancient Rome. Our largest single dataset is of the David - 2 billion polygons and 7,000 color images. In this paper, we discuss the challenges we faced in building this system, the solutions we employed, and the lessons we learned. We focus in particular on the unusual design of our laser triangulation scanner and on the algorithms and software we developed for handling very large scanned models.
- Citation:
- Marc Levoy, Kari Pulli, Brian Curless, Szymon Rusinkiewicz, David Koller, Lucas Pereira, Matt Ginzton, Sean Anderson, James Davis, Jeremy Ginsberg, Jonathan Shade, and Duane Fulk. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Project web page
Video Textures
- Abstract:
- This paper introduces a new type of medium, called a video texture, which has qualities somewhere between those of a photograph and a video. A video texture provides a continuous infinitely varying stream of images. While the individual frames of a video texture may be repeated from time to time, the video sequence as a whole is never repeated exactly. Video textures can be used in place of digital photos to infuse a static image with dynamic qualities and explicit action. We present techniques for analyzing a video clip to extract its structure, and for synthesizing a new, similar looking video of arbitrary length. We combine video textures with view morphing techniques to obtain 3D video textures. We also introduce video-based animation, in which the synthesis of video textures can be guided by a user through high-level interactive controls. Applications of video textures and their extensions include the display of dynamic scenes on web pages, the creation of dynamic backdrops for special effects and games, and the interactive control of video-based animation.
- Citation:
- Arno Schödl, Richard Szeliski, David H. Salesin, and Irfan Essa. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file]
- Project web page
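- Illustrative sketch:
- The abstract above speaks of analyzing a clip to extract its structure and then synthesizing a stream that never repeats exactly. One way to picture this, offered as an assumption rather than a statement of the paper's algorithm, is a table of transition probabilities: jumping from frame i to frame j is likely when the frame that would normally follow i looks similar to j. Frames here are flat lists of intensities, and the function names, the Euclidean distance, and the exponential mapping with parameter sigma are all illustrative choices.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two frames given as flat intensity lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def transition_probabilities(frames, sigma):
    """Row i holds the probability of jumping from frame i to each frame j.

    Assumed rule: a jump i -> j is plausible when frame i+1 resembles frame j.
    The last frame has no successor, so it gets no row.
    """
    n = len(frames)
    probs = []
    for i in range(n - 1):
        weights = [
            math.exp(-frame_distance(frames[i + 1], frames[j]) / sigma)
            for j in range(n)
        ]
        total = sum(weights)
        probs.append([w / total for w in weights])
    return probs

# A tiny "clip" that alternates between two appearances.
clip = [[0.0], [1.0], [0.0], [1.0]]
table = transition_probabilities(clip, sigma=0.1)
```

- A synthesizer could then sample the next frame from the current frame's row, which yields a sequence that loops through similar-looking frames without following a fixed cycle.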
Surface Light Fields for 3D Photography
- Abstract:
- A surface light field is a function that assigns a color to each ray originating on a surface. Surface light fields are well suited to constructing virtual images of shiny objects under complex lighting conditions. This paper presents a framework for construction, compression, interactive rendering, and rudimentary editing of surface light fields of real objects. Generalizations of vector quantization and principal component analysis are used to construct a compressed representation of an object's surface light field from photographs and range scans. A new rendering algorithm achieves interactive rendering of images from the compressed representation, incorporating view-dependent geometric level-of-detail control. The surface light field representation can also be directly edited to yield plausible surface light fields for small changes in surface geometry and reflectance properties.
- Citation:
- Daniel N. Wood, Daniel I. Azuma, Ken Aldinger, Brian Curless, Tom Duchamp, David H. Salesin, and Werner Stuetzle. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file]
- Project web page
Escherization
- Abstract:
- This paper introduces and presents a solution to the "Escherization" problem: given a closed figure in the plane, find a new closed figure that is similar to the original and tiles the plane. Our solution works by using a simulated annealer to optimize over a parameterization of the "isohedral" tilings, a class of tilings that is flexible enough to encompass nearly all of Escher's own tilings, and yet simple enough to be encoded and explored by a computer. We also describe a representation for isohedral tilings that allows for highly interactive viewing and rendering. We demonstrate the use of these tools -- along with several additional techniques for adding decorations to tilings -- with a variety of original ornamental designs.
- Citation:
- Craig S. Kaplan and David H. Salesin. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file]
- Project web page
Environment Matting Extensions: Towards Higher Accuracy and Real-Time Capture
- Abstract:
- Environment matting is a generalization of traditional bluescreen matting. By photographing an object in front of a sequence of structured light backdrops, a set of approximate light-transport paths through the object can be computed. The original environment matting research chose a middle ground---using a moderate number of photographs to produce results that were reasonably accurate for many objects. In this work, we extend the technique in two opposite directions: recovering a more accurate model at the expense of using additional structured light backdrops, and obtaining a simplified matte using just a single backdrop. The first extension allows for the capture of complex and subtle interactions of light with objects, while the second allows for video capture of colorless objects in motion.
- Citation:
- Yung-Yu Chuang, Douglas E. Zongker, Joel Hindorff, Brian Curless, David H. Salesin, and Richard Szeliski. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file, 1,458 Kb (hi-res figures)]
- Complete article [Acrobat pdf file, 423 Kb (lo-res figures)]
- Technical Report UW-CSE-2000-05-01 [Acrobat pdf file, 1,735 Kb] (SIGGRAPH paper + appendix)
- Project web page
Example-Based Hinting of TrueType Fonts
- Abstract:
- Hinting in TrueType is a time-consuming manual process in which a typographer creates a sequence of instructions for better fitting the characters of a font to a grid of pixels. In this paper, we propose a new method for automatically hinting TrueType fonts by transferring hints of one font to another. Given a hinted source font and a target font without hints, our method matches the outlines of corresponding glyphs in each font, and then translates all of the individual hints for each glyph from the source to the target font. It also translates the control value table (CVT) entries, which are used to unify feature sizes across a font. The resulting hinted font already provides a great improvement over the unhinted version. More importantly, the translated hints, which preserve the sound, hand-designed hinting structure of the original font, provide a very good starting point for a professional typographer to complete and fine-tune, saving time and increasing productivity. We demonstrate our approach with examples of automatically hinted fonts at typical display sizes and screen resolutions. We also provide estimates of the time saved by a professional typographer in hinting new fonts using this semi-automatic approach.
- Citation:
- Douglas E. Zongker, Geraldine Wade, and David H. Salesin. Proceedings of SIGGRAPH 2000, in Computer Graphics Proceedings, Annual Conference Series, 2000.
- On-line documents:
- Complete article [Acrobat pdf file, 280 Kb (hi-res figures)]
Environment Matting and Compositing
- Abstract:
- This paper introduces a new process, environment matting, which captures not just a foreground object and its traditional opacity matte from a real-world scene, but also a description of how that object refracts and reflects light, which we call an environment matte. The foreground object can then be placed in a new environment, using environment compositing, where it will refract and reflect light from that scene. Objects captured in this way exhibit not only specular but glossy and translucent effects, as well as selective attenuation and scattering of light according to wavelength. Moreover, the environment compositing process, which can be performed largely with texture mapping operations, is fast enough to run at interactive speeds on a desktop PC. We compare our results to photos of the same objects in real scenes. Applications of this work include the relighting of objects for virtual and augmented reality, more realistic 3D clip art, and interactive lighting design.
- Citation:
- Douglas E. Zongker, Dawn M. Werner, Brian Curless, and David H. Salesin. Proceedings of SIGGRAPH 99, in Computer Graphics Proceedings, Annual Conference Series, 1999.
- On-line documents:
- Complete article [Acrobat pdf file, 1,703 Kb (hi-res figures)]
- Complete article [Acrobat pdf file, 484 Kb (lo-res figures)]
- Project web page
Interactive Arrangement of Botanical L-System Models
- Abstract:
- In this paper, we explore the problem of interactively manipulating plant models without sacrificing their botanical accuracy. The primary technical contribution of the paper is a method for interactively manipulating plant structures using an inverse-kinematics optimization technique. The branches of the plant are endowed with flexural and torsional stiffnesses, and these are used in the IK optimization. We demonstrate our approach with several examples of plant models arranged in this fashion.
- Citation:
- Joanna L. Power, A. J. Bernheim Brush, David H. Salesin, and Przemyslaw Prusinkiewicz. 1999 ACM Symposium on Interactive 3D Graphics.
- On-line documents:
- Complete article [Acrobat pdf file, 120 Kb]
- Color Plate [Acrobat pdf file, 44 Kb]
Computer-Generated Floral Ornament
- Abstract:
- This paper describes some of the principles of traditional floral ornamental design, and explores ways in which these designs can be created algorithmically. It introduces the idea of "adaptive clip art," which encapsulates the rules for creating a specific ornamental pattern. Adaptive clip art can be used to generate patterns that are tailored to fit a particularly shaped region of the plane. If the region is resized or reshaped, the ornament can be automatically regenerated to fill this new area in an appropriate way. Our ornamental patterns are created in two steps: first, the geometry of the pattern is generated as a set of two-dimensional curves and filled boundaries; second, this geometry is rendered in any number of styles. We demonstrate our approach with a variety of floral ornamental designs.
- Citation:
- Michael T. Wong, Douglas E. Zongker, and David Salesin. Proceedings of SIGGRAPH 98, in Computer Graphics Proceedings, Annual Conference Series, 1998.
- On-line documents:
- Complete article [Acrobat pdf file, 9.7 Mb]
Layered Depth Images
- Abstract:
- In this paper we present a set of efficient image based rendering methods capable of rendering multiple frames per second on a PC. The first method warps Sprites with Depth representing smooth surfaces without the gaps found in other techniques. A second method for more general scenes performs warping from an intermediate representation called a Layered Depth Image (LDI). An LDI is a view of the scene from a single input camera view, but with multiple pixels along each line of sight. The size of the representation grows only linearly with the observed depth complexity in the scene. Moreover, because the LDI data are represented in a single image coordinate system, McMillan's warp ordering algorithm can be successfully adapted. As a result, pixels are drawn in the output image in back-to-front order. No z-buffer is required, so alpha-compositing can be done efficiently without depth sorting. This makes splatting an efficient solution to the resampling problem.
- Citation:
- Jonathan Shade, Steven J. Gortler, Li-wei He, and Richard Szeliski. Proceedings of SIGGRAPH 98, in Computer Graphics Proceedings, Annual Conference Series, 1998.
- On-line documents:
- Complete article [Acrobat pdf file, 841 Kb]
- Project web page
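- Illustrative sketch:
- The abstract above describes a Layered Depth Image as a single camera view that stores multiple pixels along each line of sight. The sketch below shows only that data layout, with hypothetical class and method names. Note one deliberate simplification: it sorts the samples of a pixel by depth before compositing, which is exactly the per-pixel depth sort that the warp ordering described in the abstract makes unnecessary.

```python
class LayeredDepthImage:
    """One camera view where each pixel holds a list of depth samples."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        # pixels[y][x] is a list of (depth, color, alpha) samples.
        self.pixels = [[[] for _ in range(width)] for _ in range(height)]

    def add_sample(self, x, y, depth, color, alpha=1.0):
        """Store one more surface sample along the ray through (x, y)."""
        self.pixels[y][x].append((depth, color, alpha))

    def composite_pixel(self, x, y, background=(0.0, 0.0, 0.0)):
        """Blend the samples of one pixel back to front with the 'over' rule."""
        result = list(background)
        far_to_near = sorted(self.pixels[y][x], key=lambda s: s[0], reverse=True)
        for _depth, color, alpha in far_to_near:
            result = [alpha * c + (1.0 - alpha) * r for c, r in zip(color, result)]
        return tuple(result)

ldi = LayeredDepthImage(2, 2)
ldi.add_sample(0, 0, depth=5.0, color=(1.0, 0.0, 0.0))             # far, opaque red
ldi.add_sample(0, 0, depth=1.0, color=(0.0, 0.0, 1.0), alpha=0.5)  # near, half blue
blended = ldi.composite_pixel(0, 0)
```

- Storage grows with the number of samples actually added, which mirrors the abstract's remark that the representation grows linearly with observed depth complexity.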
Reproducing Color Images Using Custom Inks
- Abstract:
- We investigate the general problem of reproducing color images on an offset press using custom inks in any combination and number. While this problem has been explored previously for the case of two inks, there are a number of new mathematical and algorithmic challenges that arise as the number of inks increases. These challenges include more complex gamut mapping strategies, more efficient ink selection strategies, and fast and numerically accurate methods for computing ink separations in situations that may be either over- or under-constrained. In addition, the demands of high-quality color printing require an accurate physical model of the colors that result from overprinting multiple inks using halftoning, including the effects of trapping, dot gain, and the interreflection of light between ink layers. In this paper, we explore these issues related to printing with multiple custom inks, and address them with new algorithms and physical models. Finally, we present some printed examples demonstrating the promise of our methods.
- Citation:
- Eric J. Stollnitz, Victor Ostromoukhov, and David Salesin. Proceedings of SIGGRAPH 98, in Computer Graphics Proceedings, Annual Conference Series, 1998.
- On-line documents:
- Article without appendices [Acrobat pdf file, 221 Kb]
- Project web page
Synthesizing Realistic Facial Expressions from Photographs
- Abstract:
- We present new techniques for creating photorealistic textured 3D facial models from photographs of a human subject, and for creating smooth transitions between different facial expressions by morphing between these different models. Starting from several uncalibrated views of a human subject, we employ a user-assisted technique to recover the camera poses corresponding to the views as well as the 3D coordinates of a sparse set of chosen locations on the subject's face. A scattered data interpolation technique is then used to deform a generic face mesh to fit the particular geometry of the subject's face. Having recovered the camera poses and the facial geometry, we extract from the input images one or more texture maps for the model. This process is repeated for several facial expressions of a particular subject. To generate transitions between these facial expressions we use 3D shape morphing between the corresponding face models, while at the same time blending the corresponding textures. Using our technique, we have been able to generate highly realistic face models and natural looking animations.
- Citation:
- Frederic Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David Salesin. Proceedings of SIGGRAPH 98, in Computer Graphics Proceedings, Annual Conference Series, 1998.
- On-line documents:
- Complete article [Acrobat pdf file, 272 Kb]
- Also available as Department of Computer Science and Engineering Technical Report TR 97-01-03.
- Text only [compressed PostScript file, 80 Kb]
- Color plates [compressed PostScript file, 80 Kb]
- Project web page
Computer-Generated Watercolor
- Abstract:
- This paper describes the various artistic effects of watercolor and shows how they can be simulated automatically. Our watercolor model is based on an ordered set of translucent glazes, which are created independently using a shallow-water fluid simulation. We use a Kubelka-Munk compositing model for simulating the optical effect of the superimposed glazes. We demonstrate how computer-generated watercolor can be used in three different applications: as part of an interactive watercolor paint system, as a method for automatic image "watercolorization", and as a mechanism for non-photorealistic rendering of three-dimensional scenes.
- Citation:
- Cassidy Curtis, Sean Anderson, Josh Seims, Kurt Fleischer, and David H. Salesin. Proceedings of SIGGRAPH 97, in Computer Graphics Proceedings, Annual Conference Series, 1997.
- On-line documents:
- The paper (gzipped PostScript, 8.1 Mb)
- The paper (.pdf, 1.83 Mb)
- The paper (.pdf, 517 Kb, images compressed beyond recognition)
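- Illustrative sketch:
- The abstract above mentions Kubelka-Munk compositing of superimposed glazes. As a rough illustration, the sketch below applies the classical Kubelka-Munk rule for stacking two layers, each described by a reflectance R and a transmittance T for a single color channel. The function names are hypothetical, and how the paper derives R and T for each glaze from pigment properties is not shown.

```python
from functools import reduce

def composite_layers(top, bottom):
    """Stack two layers given as (R, T) pairs for one color channel.

    Classical Kubelka-Munk layering:
        R = R1 + T1^2 * R2 / (1 - R1 * R2)
        T = T1 * T2 / (1 - R1 * R2)
    """
    r1, t1 = top
    r2, t2 = bottom
    inter = 1.0 - r1 * r2  # accounts for light bouncing between the layers
    return (r1 + t1 * t1 * r2 / inter, t1 * t2 / inter)

def composite_glazes(glazes):
    """Fold an ordered list of glazes (topmost first) into one (R, T) pair."""
    return reduce(composite_layers, glazes)

clear = (0.0, 1.0)   # reflects nothing, transmits everything
tinted = (0.3, 0.5)
stacked = composite_glazes([clear, tinted])
```

- A perfectly clear layer leaves the layer beneath it unchanged, which is a quick sanity check on the formulas.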
Multiperspective Panoramas for Cel Animation
- Abstract:
- Traditional 2D cel-animation uses background panoramas over which foreground characters, and the camera, move. Because characters move through complex worlds, the panorama often contains multiple views of the world taken from different perspectives, but nonetheless seamlessly integrated into a 2D painting that is locally coherent, but may be globally nonsensical. This is a difficult task in which computer graphics can be of service. The panorama-creation process is currently performed by specialists, and the complexity of camera paths through the world is limited by their ability to assemble multiple views into a coherent whole. Furthermore, once an artist has created the panorama it is often difficult to incorporate computer-generated imagery elements into the animation because it is hard to abstract a meaningful 3D geometry for the world. This paper presents a system that creates a layout-guide from a crude 3D model and a camera path through that model; this layout-guide is then used in the production of a panorama, but one in which complex paths are possible, and in which the incorporation of CG elements is simple.
- Citation:
- Daniel N. Wood, Adam Finkelstein, John F. Hughes, Craig E. Thayer, and David H. Salesin. Proceedings of SIGGRAPH 97, in Computer Graphics Proceedings, Annual Conference Series, 1997.
- On-line documents:
- Complete article [Acrobat pdf file, 2 Mb]
Orientable Textures for Image-Based Pen-and-Ink Illustration
- Abstract:
- We present an interactive system for creating pen-and-ink-style line drawings from greyscale images in which the strokes of the rendered illustration follow the features of the original image. The user, via new interaction techniques for editing a direction field, specifies an orientation for each region of the image; the computer draws oriented strokes, based on a user-specified set of example strokes, that achieve the same tone as the image via a new algorithm that compares an adaptively-blurred version of the current illustration to the target tone image. By aligning the direction field with surface orientation of the objects in the image the user can create textures that appear attached to those objects instead of merely conveying their darkness. The result is a more compelling pen-and-ink illustration than was previously possible from 2D reference imagery.
- Citation:
- Michael P. Salisbury, Michael T. Wong, John F. Hughes, and David H. Salesin. Proceedings of SIGGRAPH 97, in Computer Graphics Proceedings, Annual Conference Series, 401-406, August 1997.
- On-line documents:
- Complete article [compressed PostScript file, 2.3 Mb]
Progressive Previewing of Ray-Traced Images Using Image Plane Discontinuity Meshing
- Abstract:
- This paper presents a new method for progressively previewing a ray-traced image while it is being computed. Our method constructs and incrementally updates a constrained Delaunay triangulation for the image plane containing various important discontinuity edges in the image along with all of the image samples that have been computed by the ray tracer. The triangulation is rendered using hardware Gouraud shading, yielding a piecewise linear approximation to the final image. Texture-mapped surfaces, as well as other regions in the image that are not well approximated by linear interpolation, are handled through the use of hardware texture mapping.
- Citation:
- F.P. Pighin, D. Lischinski, and D.H. Salesin. Eighth Eurographics Workshop on Rendering, 115-125, Saint-Etienne, France, June 1997.
- On-line documents:
- Complete article [Acrobat pdf file, 1.2 MB]
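- Illustrative sketch:
- The "piecewise linear approximation" in the abstract above comes from interpolating sample values linearly across each triangle, which is what Gouraud shading does in hardware. The sketch below does the same thing in software for a single triangle using barycentric weights. The function names are hypothetical, values are single numbers rather than colors, and the triangulation itself is not shown.

```python
def barycentric_weights(p, a, b, c):
    """Weights (wa, wb, wc) such that p = wa*a + wb*b + wc*c in 2D."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    wa = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    wb = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return wa, wb, 1.0 - wa - wb

def interpolate(p, triangle, values):
    """Linearly interpolate per-vertex sample values at point p."""
    wa, wb, wc = barycentric_weights(p, *triangle)
    return wa * values[0] + wb * values[1] + wc * values[2]

triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
samples = (0.0, 4.0, 8.0)   # ray-traced values at the three vertices
preview = interpolate((0.25, 0.25), triangle, samples)
```

- At a vertex the interpolated value equals the sample stored there, so the preview always agrees with the ray tracer at the points it has already computed.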
Clustering for Glossy Global Illumination
- Abstract:
- We present a new clustering algorithm for global illumination in complex environments. The new algorithm extends previous work on clustering for radiosity to allow for non-diffuse (glossy) reflectors. We represent clusters as points with directional distributions of outgoing and incoming radiance and importance, and we derive an error bound for transfers between these clusters. The algorithm groups input surfaces into a hierarchy of clusters, and then permits clusters to interact only if the error bound is below an acceptable tolerance. We show that the algorithm is asymptotically more efficient than previous clustering algorithms even when restricted to ideally diffuse environments. Finally, we demonstrate the performance of our method on two complex glossy environments.
- Citation:
- Per H. Christensen, Dani Lischinski, Eric J. Stollnitz, and David H. Salesin. Clustering for glossy global illumination. ACM Transactions on Graphics 16, January 1997.
- On-line documents:
- Complete article [Acrobat file, 843 Kb]
- Complete article [compressed PostScript file, 9.1 Mb]
- Article without color images [compressed PostScript file, 92 Kb]
Comic Chat
- Abstract:
- Comics have a rich visual vocabulary, and people find them appealing. They are also an effective form of communication. We have built a system, called Comic Chat, that represents on-line communications in the form of comics. Comic Chat automates numerous aspects of comics generation, including balloon construction and layout, the placement and orientation of comic characters, the default selection of character gestures and expressions, the incorporation of semantic panel elements, and the choice of zoom factor for the virtual camera. This paper describes the mechanisms that Comic Chat uses to perform this automation, as well as novel aspects of the program's user interface. Comic Chat is a working program, allowing groups of people to communicate over the Internet. It has several advantages over other graphical chat programs, including the availability of a graphical history, and a dynamic graphical presentation.
- Citation:
- D. Kurlander, David H. Salesin, T. Skelly. Comic chat. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 225-236, August 1996.
- On-line documents:
- Complete article [Acrobat file, 2.3 Mb]
Declarative Camera Control for Automatic Cinematography
- Abstract:
- Animations generated by interactive 3D computer graphics applications are typically portrayed either from a particular character's point of view or from a small set of strategically-placed viewpoints. By ignoring camera placement, such applications fail to realize important storytelling capabilities that have been explored by cinematographers for many years. In this paper, we describe several of the principles of cinematography and show how they can be formalized into a declarative language, called the Declarative Camera Control Language (DCCL). We describe the application of DCCL within the context of a simple interactive video game and argue that DCCL represents cinematic knowledge at the same level of abstraction as expert directors by encoding 16 idioms from a film textbook. These idioms produce compelling animations, as demonstrated on the accompanying videotape.
- Citation:
- David B. Christianson, Sean E. Anderson, Li-Wei He, Daniel S. Weld, Michael F. Cohen, David H. Salesin. Declarative camera control for automatic cinematography. Proceedings of AAAI '96 (Portland, OR), 148-155, 1996. An earlier version appeared as Department of Computer Science and Engineering Technical Report TR 95-01-03, University of Washington, 1995.
- On-line documents:
- Complete article [Acrobat file, 240 KB]
Fast Rendering of Complex Environments Using a Spatial Hierarchy
- Abstract:
- We present a new method for accelerating the rendering of complex static scenes. The technique is applicable to unstructured scenes containing arbitrary geometric primitives and has sublinear asymptotic complexity. Our approach is to construct a spatial hierarchy of cells over the scene and to associate with each cell a simplified representation of its contents. The scene is then rendered using a traversal of the hierarchy in which a cell's approximation is drawn instead of its contents if the approximation is sufficiently accurate. We apply the method to several different scenes and demonstrate significant speedups with little image degradation. We also exhibit and discuss some of the artifacts that our approximation may cause.
- Citation:
- Brad Chamberlain, Tony DeRose, Dani Lischinski, David H. Salesin, and John Snyder. Fast rendering of complex environments using a spatial hierarchy. Proceedings of Graphics Interface '96 (Toronto), 132-141, 1996.
- On-line documents:
- Text of article [compressed PostScript file, 38 Kb]
- Color plates 1 and 2 [compressed PostScript file, 225 Kb]
- Color plates 3 to 5 [compressed PostScript file, 1 Mb]
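As a rough illustration of the traversal described in the abstract above, the following Python sketch draws a cell's approximation when it is accurate enough and otherwise descends into the cell's children. The `Cell` class, the scalar `error` field, and the `tolerance` parameter are assumptions made for this example and are not taken from the paper, where accuracy would presumably depend on the viewpoint.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One node of a hypothetical spatial hierarchy over a static scene."""
    name: str
    error: float  # assumed scalar error of this cell's simplified representation
    children: list = field(default_factory=list)
    primitives: list = field(default_factory=list)


def render(cell, tolerance, draw_list=None):
    """Collect what would be drawn for `cell` under the given error tolerance."""
    if draw_list is None:
        draw_list = []
    if cell.error <= tolerance:
        # The approximation is accurate enough: draw it instead of the contents.
        draw_list.append(("approx", cell.name))
    elif cell.children:
        # Otherwise refine by visiting the child cells.
        for child in cell.children:
            render(child, tolerance, draw_list)
    else:
        # A leaf whose approximation is too coarse: draw its actual geometry.
        for primitive in cell.primitives:
            draw_list.append(("primitive", primitive))
    return draw_list


leaf_a = Cell("a", error=1.0, primitives=["s1"])
leaf_b = Cell("b", error=3.0, primitives=["t1", "t2"])
root = Cell("root", error=5.0, children=[leaf_a, leaf_b])
print(render(root, tolerance=2.0))
```

With a loose tolerance only the root approximation is drawn; tightening the tolerance pushes the traversal deeper, which is where the trade-off between speed and image quality comes from.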
Global Illumination of Glossy Environments Using Wavelets and Importance
- Abstract:
- We show how importance-driven refinement and a wavelet basis can be combined to provide an efficient solution to the global illumination problem with glossy and diffuse reflections. Importance is used to focus the computation on the interactions having the greatest impact on the visible solution. Wavelets are used to provide an efficient representation of radiance, importance, and the transport operator. We discuss a number of choices that must be made when constructing a finite element algorithm for glossy global illumination. Our algorithm is based on the standard wavelet decomposition of the transport operator and makes use of a four-dimensional wavelet representation for spatially- and angularly-varying radiance distributions. We use a final gathering step to improve the visual quality of the solution. Features of our implementation include support for curved surfaces as well as texture-mapped anisotropic emission and reflection functions.
- Citation:
- Per H. Christensen, Eric J. Stollnitz, David H. Salesin, and Tony D. DeRose. Global illumination of glossy environments using wavelets and importance. ACM Transactions on Graphics, 15(1):37-71, January 1996.
- On-line documents:
- Complete article [Acrobat file, 611 Kb]
- Complete article [compressed PostScript file, 2.8 Mb]
Hierarchical Image Caching for Accelerated Walkthroughs of Complex Environments
- Abstract:
- We present a new method for accelerating walkthroughs of geometrically complex static scenes. As a preprocessing step, our method constructs a BSP-tree that hierarchically partitions the geometric primitives in the scene. In the course of a walkthrough, images of nodes at various levels of the hierarchy are cached for reuse in subsequent frames. A cached image is applied as a texture map to a single quadrilateral that is drawn instead of the geometry contained in the corresponding node. Visual artifacts are kept under control by using an error metric that quantifies the discrepancy between the appearance of geometry contained in a node and the cached image. The new method is shown to achieve significant speedups for a walkthrough of a complex outdoor scene, with little or no loss in rendering quality.
- Citation:
- Jonathan Shade, Dani Lischinski, Tony D. DeRose, John Snyder, and David H. Salesin. Hierarchical image caching for accelerated walkthroughs of complex environments. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 75-82, August 1996.
- On-line documents:
- Complete article [Acrobat pdf file, 180 Kb]
- Also available as Department of Computer Science and Engineering Technical Report TR 96-01-06.
- TR 96-01-06 [compressed postscript file, 40 Kb]
- Plate1 [compressed Postscript file, 824 Kb]
- Plate2 [compressed Postscript file, 704 Kb]
- Project web page
Interactive Multiresolution Surface Viewing
- Abstract:
- Multiresolution analysis has been proposed as a basic tool supporting compression, progressive transmission, and level-of-detail control of complex meshes in a unified and theoretically sound way.
- We extend previous work on multiresolution analysis of meshes in two ways. First, we show how to perform multiresolution analysis of colored meshes by separately analyzing shape and color. Second, we describe efficient algorithms and data structures that allow us to incrementally construct lower resolution approximations to colored meshes from the geometry and color wavelet coefficients at interactive rates. We have integrated these algorithms in a prototype mesh viewer that supports progressive transmission, dynamic display at a constant frame rate independent of machine characteristics and load, and interactive choice of tradeoff between the amount of detail in geometry and color. The viewer operates as a helper application to Netscape, and can therefore be used to rapidly browse and display complex geometric models stored on the World Wide Web.
- Citation:
- Andrew Certain, Jovan Popovic, Tony DeRose, Tom Duchamp, David H. Salesin, Werner Stuetzle. Interactive multiresolution surface viewing. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 91-98, August 1996.
- Available as Department of Computer Science and Engineering Technical Report TR 96-01-07, University of Washington, 1996.
- On-line documents:
- Complete article [Acrobat file, 435 Kb]
Multiresolution Video
- Abstract:
- We present a new representation for time-varying image data, called multiresolution video. The representation allows for varying -- and arbitrarily high -- spatial and temporal resolutions in different parts of a video sequence. The representation is based on a sparse, hierarchical encoding of the video data. We show how multiresolution video supports a number of primitive operations: drawing frames at a particular spatial and temporal resolution; and translating, scaling, and compositing multiresolution sequences. These primitives are then used as the building blocks to support a variety of applications: video compression; multiresolution playback, including motion-blurred "fast-forward" and "reverse"; constant speed display; enhanced video scrubbing; and "video clip art" editing and compositing. The multiresolution representation requires little storage overhead, and the algorithms using the representation are both simple and efficient.
- Citation:
- Adam Finkelstein, Charles E. Jacobs, David H. Salesin. Multiresolution Video. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 281-290, August 1996.
- On-line documents:
- Complete article [Acrobat file, 651 Kb]
- Complete article [compressed Postscript file, 3.5 Mb]
- Article without color images [compressed Postscript file, 54Kb]
Rendering Parametric Surfaces in Pen and Ink
- Abstract:
- This paper presents new algorithms and techniques for rendering parametric free-form surfaces in pen and ink. In particular, we introduce the idea of "controlled-density hatching" for conveying tone, texture, and shape. The fine control over tone this method provides allows the use of traditional texture mapping techniques for specifying the tone of pen-and-ink illustrations. We also show how a planar map, a data structure central to our rendering algorithm, can be constructed from parametric surfaces, and used for clipping strokes and generating outlines. Finally, we show how curved shadows can be cast onto curved objects for this style of illustration.
- Citation:
- Georges Winkenbach, David H. Salesin. Rendering parametric surfaces in pen and ink. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 469-476, August 1996.
- On-line documents:
- Available as Technical Report:
- 96-01-05 [compressed Postscript file, 801 Kb]
Reproducing Color Images as Duotones
- Abstract:
- We investigate a new approach for reproducing color images. Rather than mapping the colors in an image onto the gamut of colors that can be printed with cyan, magenta, yellow, and black inks, we choose the set of printing inks for the particular image being reproduced. In this paper, we look at the special case of selecting inks for duotone printing, a relatively inexpensive process in which just two inks are used. Specifically, the system we describe takes an image as input, and allows a user to select 0, 1, or 2 inks. It then chooses the remaining ink or inks so as to reproduce the image as accurately as possible and produces the appropriate color separations automatically.
- Citation:
- Joanna L. Power, Brad S. West, Eric J. Stollnitz, and David H. Salesin. Reproducing color images as duotones. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 237-248, August 1996.
- On-line documents:
- Article without duotones [Acrobat file, 2.8 Mb]
- Article without duotones [compressed PostScript file, 3.0 Mb]
Scale-dependent Reproduction of Pen-and-ink Illustrations
- Abstract:
- This paper describes a compact resolution- and scale-independent representation for pen-and-ink illustrations. The proposed representation consists of a low-resolution grey-scale image, augmented by a set of discontinuity segments. We also present a new reconstruction algorithm that magnifies the low-resolution image while keeping the image sharp along the discontinuities. By storing pen-and-ink illustrations in this representation, we can produce high-fidelity illustrations at any scale and resolution by generating an image of the desired size and filling that image with pen-and-ink strokes.
- Citation:
- Mike Salisbury, Corey Anderson, Dani Lischinski, and David H. Salesin. Scale-dependent Reproduction of Pen-and-ink Illustrations. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 461-468, August 1996.
- On-line documents:
- Available as Technical Report:
- TR 96-01-02 [compressed PostScript file, 11.5 Mb]
Wavelets for Computer Graphics: Theory and Applications
- Preview:
- This distinctly accessible introduction to wavelets provides computer graphics professionals and researchers with the mathematical foundations for understanding and applying this new and powerful tool.
- Wavelets are rapidly becoming a core technique in computer graphics, with applications to:
- image editing and compression;
- automatic level-of-detail control for editing and rendering curves and surfaces;
- surface reconstruction from contours; and
- physical simulation for global illumination and animation.
- Stressing intuition and clarity, this book offers a solid understanding of the theory of wavelets and their proven applications in computer graphics.
- Although previous introductions to wavelets have presented an elegant mathematical framework, that framework is too restrictive to apply to many problems in graphics. In contrast, this book focuses on a generalized theory that naturally accommodates the kinds of objects that commonly arise in computer graphics, including images, open curves, and surfaces of arbitrary topology.
- The book also contains a foreword by Ingrid Daubechies and an appendix covering the necessary background material in linear algebra.
- Contents: See the table of contents.
- Citation:
- Eric J. Stollnitz, Tony D. DeRose, and David H. Salesin. Wavelets for Computer Graphics: Theory and Applications. Morgan Kaufmann, San Francisco, 1996.
- ISBN: 1-55860-375-1
- Ordering information: See Morgan Kaufmann's web site.
- On-line material:
- Matlab code from Appendix C [compressed tar file, 17 Kb]
The Virtual Cinematographer: a Paradigm for Automatic Real-Time Camera Control and Directing
- Abstract:
- This paper presents a paradigm for automatically generating complete camera specifications for capturing events in virtual 3D environments in real-time. We describe a fully implemented system, called the Virtual Cinematographer, and demonstrate its application in a virtual "party" setting. Cinematographic expertise, in the form of film idioms, is encoded as a set of small hierarchically organized finite state machines. Each idiom is responsible for capturing a particular type of scene, such as three virtual actors conversing or one actor moving across the environment. The idiom selects shot types and the timing of transitions between shots to best communicate events as they unfold. A set of camera modules, shared by the idioms, is responsible for the low-level geometric placement of specific cameras for each shot. The camera modules are also responsible for making subtle changes in the virtual actors' positions to best frame each shot. In this paper, we discuss some basic heuristics of filmmaking and show how these ideas are encoded in the Virtual Cinematographer.
- Citation:
- Li-Wei He, Michael F. Cohen, David H. Salesin. The virtual cinematographer: a paradigm for automatic real-time camera control and directing. Proceedings of SIGGRAPH 96, in Computer Graphics Proceedings, Annual Conference Series, 217-224, August 1996.
- On-line documents:
- Complete article [Acrobat file, 158 Kb]
Fast Multiresolution Image Querying
- Abstract:
- We present a method for searching in an image database using a query image that is similar to the intended target. The query image may be a hand-drawn sketch or a (potentially low-quality) scan of the image to be retrieved. Our searching algorithm makes use of multiresolution wavelet decompositions of the query and database images. The coefficients of these decompositions are distilled into small "signatures" for each image. We introduce an "image querying metric" that operates on these signatures. This metric essentially compares how many significant wavelet coefficients the query has in common with potential targets. The metric includes parameters that can be tuned, using a statistical analysis, to accommodate the kinds of image distortions found in different types of image queries. The resulting algorithm is simple, requires very little storage overhead for the database signatures, and is fast enough to be performed on a database of 20,000 images at interactive rates (on standard desktop machines) as a query is sketched. Our experiments with hundreds of queries in databases of 1000 and 20,000 images show dramatic improvement, in both speed and success rate, over using a conventional L1, L2, or color histogram norm.
- Citation:
- Charles E. Jacobs, Adam Finkelstein, David H. Salesin. Fast Multiresolution Image Querying. Proceedings of SIGGRAPH 95, in Computer Graphics Proceedings, Annual Conference Series, pages 277-286, August 1995.
- On-line documents:
- Available as Technical Report TR 95-01-06:
- Complete report [compressed PostScript file, 474 Kb]
- Report without color plates [compressed PostScript file, 63 Kb]
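To make the idea of comparing signatures concrete, here is a minimal Python sketch that keeps only the positions and signs of the largest-magnitude coefficients and scores a target by how many of them it shares with the query. It is a simplified stand-in, not the paper's metric: the function names are hypothetical, the coefficient lists are assumed to come from some wavelet decomposition computed elsewhere, and the tunable parameters mentioned in the abstract are left out.

```python
def signature(coeffs, m):
    """Keep the index and sign of the m largest-magnitude nonzero coefficients."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: (1 if coeffs[i] > 0 else -1) for i in ranked[:m] if coeffs[i] != 0}


def match_count(query_sig, target_sig):
    """Count coefficients present in both signatures with the same sign."""
    return sum(1 for i, sign in query_sig.items() if target_sig.get(i) == sign)


def rank_targets(query_sig, database):
    """Order database image names from best to worst match (unweighted count)."""
    return sorted(database, key=lambda name: match_count(query_sig, database[name]),
                  reverse=True)


# Toy coefficient vectors standing in for wavelet decompositions of images.
query = signature([5.0, -3.0, 0.1, 2.0], m=2)
database = {
    "a": signature([4.0, -2.5, 0.0, 0.2], m=2),
    "b": signature([-4.0, 0.1, 3.0, 0.0], m=2),
}
print(rank_targets(query, database))
```

Because each signature stores only a handful of indices and signs, the per-image storage is small and a comparison touches few entries, which is consistent with the low overhead and interactive speed the abstract reports.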
Multiresolution Analysis of Arbitrary Meshes
- Abstract:
- In computer graphics and geometric modeling, shapes are often represented by triangular meshes. With the advent of laser scanning systems, meshes of extreme complexity are rapidly becoming commonplace. Such meshes are notoriously expensive to store, transmit, and render, and are awkward to edit. Multiresolution analysis offers a simple, unified, and theoretically sound approach to dealing with these problems. Lounsbery et al. have recently developed a technique for creating multiresolution representations for a restricted class of meshes with subdivision connectivity. Unfortunately, meshes encountered in practice typically do not meet this requirement. In this paper we present a method for overcoming the subdivision connectivity restriction, meaning that completely arbitrary meshes can now be converted to multiresolution form. The method is based on the approximation of an arbitrary initial mesh M by a mesh M3 that has subdivision connectivity and is guaranteed to be within a specified tolerance.
- The key ingredient of our algorithm is the construction of a parametrization of M over a simple domain. We expect this parametrization to be of use in other contexts, such as texture mapping or the approximation of complex meshes by NURBS patches.
- Citation:
- Matthias Eck, Tony DeRose, Tom Duchamp, Hugues Hoppe, Michael Lounsbery, and Werner Stuetzle. Multiresolution Analysis of Arbitrary Meshes. Technical Report #95-01-02, January 1995.
- On-line documents:
- Article [compressed PostScript file 1.3 Mb]
- Color plate 1 [compressed Postscript file 1.1 Mb]
- Color plate 2 [compressed Postscript file 1.6 Mb]
- Color plate 3 [compressed Postscript file 1 Mb]
- Color plate 4 [compressed Postscript file 1 Mb]
Wavelets for Computer Graphics: A Primer
- Abstract:
- Wavelets are a mathematical tool for hierarchically decomposing functions. Using wavelets, a function can be described in terms of a coarse overall shape, plus details that range from broad to narrow. Regardless of whether the function of interest is an image, a curve, or a surface, wavelets provide an elegant technique for representing the levels of detail present. This primer is intended to provide those working in computer graphics with some intuition for what wavelets are, as well as to present the mathematical foundations necessary for studying and using them. In Part 1, we discuss the simple case of Haar wavelets in one and two dimensions, and show how they can be used for image compression. Part 2 presents the mathematical theory of multiresolution analysis, develops bounded-interval spline wavelets, and describes their use in multiresolution curve and surface editing.
- Citations:
- Eric J. Stollnitz, Tony D. DeRose, and David H. Salesin. Wavelets for computer graphics: A primer, part 1. IEEE Computer Graphics and Applications, 15(3):76-84, May 1995.
- Eric J. Stollnitz, Tony D. DeRose, and David H. Salesin. Wavelets for computer graphics: A primer, part 2. IEEE Computer Graphics and Applications, 15(4):75-85, July 1995.
- On-line documents:
- Part 1 [Acrobat file, 264 Kb]
- Part 1 [compressed PostScript file, 473 Kb]
- Part 2 [Acrobat file, 865 Kb]
- Part 2 [compressed PostScript file, 417 Kb]
- Matlab code [compressed tar file, 17 Kb]
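The one-dimensional Haar case mentioned in the abstract above can be sketched in a few lines of Python. This is an illustrative, unnormalized version written for this page rather than the Matlab code listed above; the function names are assumptions, and the input length is assumed to be a power of two.

```python
def haar_decompose(values):
    """Unnormalized 1D Haar transform by repeated pairwise averaging and differencing."""
    coeffs = list(values)
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        # Coarse averages go in front; detail coefficients for this level follow.
        coeffs[:n] = averages + details
        n = half
    return coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose, rebuilding the signal from coarse to fine."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        expanded = []
        for average, detail in zip(out[:n], out[n:2 * n]):
            expanded += [average + detail, average - detail]
        out[:2 * n] = expanded
        n *= 2
    return out


signal = [9, 7, 3, 5]
coeffs = haar_decompose(signal)
print(coeffs)                    # overall average first, then details coarse to fine
print(haar_reconstruct(coeffs))  # recovers the original values
```

The first coefficient is the overall average (the coarse shape) and the remaining ones are details ordered from broad to narrow. Setting small detail coefficients to zero before reconstructing is the basic idea behind the compression use the abstract mentions.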
Computer-Generated Pen-and-Ink Illustration
- Abstract:
- This paper describes the principles of traditional pen-and-ink illustration, and shows how a great number of them can be implemented as part of an automated rendering system. It introduces "stroke textures," which can be used for achieving both texture and tone with line drawing. Stroke textures also allow resolution-dependent rendering, in which the choice of strokes used in an illustration is appropriately tied to the resolution of the target medium. We demonstrate these techniques using complex architectural models, including Frank Lloyd Wright's "Robie House."
- Citation:
- Georges Winkenbach and David H. Salesin. Computer-Generated Pen-and-Ink Illustration. Proceedings of SIGGRAPH 94 (Orlando, Florida, July 24-29, 1994). In Computer Graphics, Annual Conference Series, 1994.
- On-line documents:
- Available as Technical Report:
- TR 94-01-08b [compressed Postscript file, 2.2 Mb]
Interactive Pen-and-Ink Illustration
- Abstract:
- We present an interactive system for creating pen-and-ink illustrations. The system uses stroke textures--collections of strokes arranged in different patterns--to generate texture and tone. The user "paints" with a desired stroke texture to achieve a desired tone, and the computer draws all of the individual strokes.
- The system includes support for using scanned or rendered images for reference to provide the user with guides for outline and tone. By following these guides closely, the illustration system can be used for interactive digital halftoning, in which stroke textures are applied to convey details that would otherwise be lost in this black-and-white medium.
- By removing the burden of placing individual strokes from the user, the illustration system makes it possible to create fine stroke work with a purely mouse-based interface. Thus, this approach holds promise for bringing high-quality black-and-white illustration to the world of personal computing and desktop publishing.
- Citation:
- Michael P. Salisbury, Sean E. Anderson, Ronen Barzel, and David H. Salesin. Interactive Pen-and-Ink Illustration. Proceedings of SIGGRAPH 94, in Computer Graphics Proceedings, Annual Conference Series, pages 101-108, July 1994.
- On-line documents:
- Complete article [Acrobat file, 21MB]
Multiresolution Analysis for Surfaces of Arbitrary Topological Type
- Abstract:
- Multiresolution analysis provides a useful and efficient tool for representing shape and analyzing features at multiple levels of detail. Although the technique has met with considerable success when applied to univariate functions, images, and more generally to functions defined on R^n, to our knowledge it has not been extended to functions defined on surfaces of arbitrary genus.
- In this report, we demonstrate that multiresolution analysis can be extended to surfaces of arbitrary genus using techniques from subdivision surfaces. We envision many applications for this work, including automatic level-of-detail control in high-performance graphics rendering, compression of CAD models, and acceleration of global illumination algorithms. We briefly sketch one of these applications, that of automatic level-of-detail control of polyhedral surfaces.
- Citation:
- Tony D. DeRose, Michael Lounsbery, Joe Warren. Multiresolution Analysis for Surfaces of Arbitrary Topological Type. Department of Computer Science and Engineering, University of Washington Technical Report TR 93-10-05, October 29, 1993.
- On-line documents:
- [PDF document, 1.0MB]
Multiresolution Curves
- Abstract:
- We describe a multiresolution curve representation, based on wavelets, that conveniently supports a variety of operations: smoothing a curve; editing the overall form of a curve while preserving its details; and approximating a curve within any given error tolerance for scan conversion. We present methods to support continuous levels of smoothing as well as direct manipulation of an arbitrary portion of the curve; the control points, as well as the discrete nature of the underlying hierarchical representation, can be hidden from the user. The multiresolution representation requires no extra storage beyond that of the original control points, and the algorithms using the representation are both simple and fast.
- Citation:
- Adam Finkelstein, David H. Salesin. Multiresolution Curves. In Proceedings of SIGGRAPH '94, pages 261-268. ACM, New York, 1994.
- On-line documents:
- Available as Technical Report:
- TR 94-01-06b [compressed PostScript file, 352Kb]
Multiresolution Painting and Compositing
- Abstract:
- We describe a representation for "multiresolution images"--images that have different resolutions in different places--and methods for creating such images using painting and compositing operations. These methods are very easy to implement, and they are efficient in both memory and speed. At a particular resolution, the representation requires space proportional only to the amount of detail actually present, and the most common painting operations, "over" and "erase," require time proportional only to the number of pixels displayed. Finally, we show how "fractional-level zooming" can be implemented in order to allow a user to display and edit portions of a multiresolution image at any arbitrary size.
- Citation:
- Deborah F. Berman, Jason T. Bartell, David H. Salesin. Multiresolution Painting and Compositing. Proceedings of SIGGRAPH 94, in Computer Graphics Proceedings, Annual Conference Series, 85-90, July 1994.
- On-line documents:
- Available as Technical Report:
- TR 94-01-09b [compressed PostScript file, 8.8 Mb]
Multiresolution Tiling
- Abstract:
- This paper describes an efficient method for constructing a tiling between a pair of planar contours. The problem is of interest in a number of domains, including medical imaging, biological research and geological reconstructions. Our method, based on ideas from multiresolution analysis and wavelets, requires O(n) space and appears to require O(n log n) time for average inputs, compared to the O(n^2) space and O(n^2 log n) time required by the optimizing algorithm of Fuchs, Kedem and Uselton. The results computed by our algorithm are in many cases nearly the same as those of the optimizing algorithm, but at a small fraction of the computational cost. The performance improvement makes the algorithm usable for large contours in an interactive system. The use of multiresolution analysis provides an efficient mechanism for data compression by discarding wavelet coefficients smaller than a threshold value during reconstruction. The amount of detail lost can be controlled by appropriate choice of the threshold value. The use of lower resolution approximations to the original contours yields significant savings in the time required to display a reconstructed object, and in the space required to store it.
- Citation:
- David Meyers. Multiresolution Tiling. Computer Graphics Forum, December 1994.
- On-line documents:
- Complete article [compressed PostScript file, 233Kb]
- Graphics Interface '94 version [compressed PostScript file, 189Kb]
Piecewise Smooth Surface Reconstruction
- Abstract:
- We present a general method for automatic reconstruction of accurate, concise, piecewise smooth surface models from scattered range data. The method can be used in a variety of applications such as reverse engineering--the automatic generation of CAD models from physical objects. Novel aspects of the method are its ability to model surfaces of arbitrary topological type and to recover sharp features such as creases and corners. The method has proven to be effective, as demonstrated by a number of examples using both simulated and real data.
- A key ingredient in the method, and a principal contribution of this paper, is the introduction of a new class of piecewise smooth surface representations based on subdivision. These surfaces have a number of properties that make them ideal for use in surface reconstruction: they are simple to implement, they can model sharp features concisely, and they can be fit to scattered range data using an unconstrained optimization procedure.
- Citation:
- Hugues Hoppe, Tony DeRose, Tom Duchamp, Mark Halstead, Hubert Jin, John McDonald, Jean Schweitzer, Werner Stuetzle. Piecewise Smooth Surface Reconstruction. Computer Graphics Proceedings, Annual Conference Series, 1994.
- On-line documents:
- Complete article [compressed PostScript file, 98Kb]
Wavelet Radiance
- Abstract:
- In this paper, we show how wavelet analysis can be used to provide an efficient solution method for global illumination with glossy and diffuse reflections. Wavelets are used to sparsely represent radiance distribution functions and the transport operator. In contrast to previous wavelet methods (for radiosity), our algorithm transports light directly among wavelets, and eliminates the pushing and pulling procedures.
- The framework we describe supports curved surfaces and spatially-varying anisotropic BRDFs. We use importance to make the global illumination problem tractable for complex scenes, and we use a final gathering step to improve the visual quality of the solution.
- Citation:
- Per H. Christensen, Eric J. Stollnitz, David H. Salesin, and Tony D. DeRose. Wavelet radiance. In G. Sakas, P. Shirley, and S. Müller, editors, Photorealistic Rendering Techniques, pages 295-309. Springer-Verlag, Berlin, 1995.
- Reprinted from Proceedings of the Fifth Eurographics Workshop on Rendering (Darmstadt, Germany, June 1994), pages 287-302.
- On-line documents:
- Complete article [Acrobat file, 263 Kb]
- Complete article [compressed PostScript file, 1.4 Mb]
Electronic "How Things Work" Articles: Two Early Prototypes
- Abstract:
- The Electronic Encyclopedia Exploratorium (E3) is a vision of a future computer system--a kind of electronic "How Things Work" book. Typical articles in E3 will describe such mechanisms as compression refrigerators, engines, telescopes, and mechanical linkages. Each article will provide simulations, 3-dimensional animated graphics that the user can manipulate, laboratory areas that allow a user to modify the device or experiment with related artifacts, and a facility for asking questions and receiving customized, computer-generated English language explanations. In this paper, we discuss some of the foundational technology--especially focusing on topics in artificial intelligence, graphics, and user interfaces--needed to achieve this long-term vision. We describe our two initial prototypes and the technical lessons we've learned from them.
- Citation:
- F. Amador, Deborah Berman, Alan Borning, Tony D. DeRose, Adam Finkelstein, D. Neville, David Notkin, David H. Salesin, Michael Salisbury, J. Sherman, Y. Sun, D. Weld, G. Winkenbach. Electronic "How Things Work" articles: Two early prototypes. IEEE Transactions on Knowledge and Data Engineering 5(4): 611-618, August 1993.
- On-line documents:
- Earlier version of article [Postscript file, 306 Kb]
Mesh Optimization
- Abstract:
- We present a method for solving the following problem: Given a set of data points scattered in three dimensions and an initial triangular mesh M0, produce a mesh M, of the same topological type as M0, that fits the data well and has a small number of vertices. Our approach is to minimize an energy function that explicitly models the competing desires of conciseness of representation and fidelity to the data. We show that mesh optimization can be effectively used in at least two applications: surface reconstruction from unorganized points, and mesh simplification (the reduction of the number of vertices in an initially dense mesh of triangles).
- Citation:
- Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, Werner Stuetzle. Mesh Optimization. In SIGGRAPH 93 Conference Proceedings. ACM, New York, 1993.
- On-line documents:
- Available as Technical Report:
- TR 93-01-01 [compressed PostScript file, 972 Kb]
Three-Dimensional Computer Graphics: A Coordinate Free Approach
- Preface:
- This manuscript is intended as a rigorous introduction to the field of computer graphics at a level appropriate for advanced undergraduates and beginning graduate students in computer science. My intent is not to present a completely comprehensive survey of the field. Rather, my goal is to provide a firm, modern account of those topics within the subfield of three-dimensional raster graphics that can be given adequate treatment in a ten week session. I have therefore, unfortunately, been forced to eliminate discussions of many interesting topics. The text by Foley, van Dam, Feiner, and Hughes should be considered a primary reference for topics not covered here.
- The manuscript is based on two courses (CSE 457 and 557) that I have taught over the past several years. The most distinguishing feature is the treatment of the geometric component of the material. Rather than using coordinate calculations, matrices, and matrix manipulations to accomplish geometric computations, a so-called coordinate-free approach is used. It is my feeling that a great deal of conceptual clarity and programming power is achieved by moving to the slightly higher level of abstraction provided by the coordinate-free framework.
- Citation:
- Tony D. DeRose, unpublished manuscript, 1993.
- On-line documents:
- Complete manuscript [compressed PostScript file, 1.3 Mb]
- Coordinate-free library for geometric programming:
A Continuous Adjoint Formulation for Radiance Transport
- Abstract:
- We describe a continuous adjoint formulation for radiance transport that allows a global illumination algorithm to focus on the directional interactions that contribute most to the visible scene. We show how the adjoint quantity for radiance can be described by an angular distribution that is only piecewise-continuous. This observation motivates the formulation of a related adjoint quantity, called exitant directional importance, whose angular distribution is continuous. We prove that exitant directional importance is equivalent to radiance in the sense that the two quantities satisfy the same transport equation and can be propagated through the environment in the same fashion.
- An adjoint formulation can dramatically reduce the time to compute radiosities when much of the scene is invisible. We present some preliminary results that demonstrate how the adjoint formulation for radiance can provide significant speed-ups even when all surfaces are visible.
- Citation:
- Per H. Christensen, David H. Salesin, Tony D. DeRose. Proceedings of the Fourth Eurographics Workshop on Rendering (Paris, France), 95-104, 1993.
- On-line documents:
- Article without figures [Acrobat file, 107 Kb]
Reconstructing Illumination Functions with Selected Discontinuities
- Abstract:
- Typical illumination functions contain boundaries that are discontinuous in intensity or derivative. These discontinuities arise from contact between surfaces, and from the penumbra and umbra boundaries of shadows cast by area light sources. In this paper, we present an algorithm that allows for smooth (C1) reconstruction of intensity everywhere across a surface except along selected edges of intensity or derivative discontinuity. The reconstruction algorithm is based on a piecewise-cubic scattered data interpolation method originally proposed by Clough and Tocher. Our results show marked improvement over piecewise linear or C1 quadratic reconstructions of some simple illumination functions.
- Citation:
- Dani Lischinski, Tony D. DeRose, David H. Salesin. Proceedings of the Third Eurographics Workshop on Rendering (Bristol, England), 99-112, 1992.
- On-line documents:
- Article without figures [Acrobat file, 155 Kb]
Surface Reconstruction from Unorganized Points
- Abstract:
- We describe and demonstrate an algorithm that takes as input an unorganized set of points {x1,...,xn} on or near an unknown manifold M, and produces as output a simplicial surface that approximates M. Neither the topology, the presence of boundaries, nor the geometry of M are assumed to be known in advance - all are inferred automatically from the data. This problem naturally arises in a variety of practical situations such as range scanning an object from multiple view points, recovery of biological shapes from two-dimensional slices, and interactive surface sketching.
- Citation:
- Hugues Hoppe, Tony DeRose, Tom Duchamp, John McDonald, Werner Stuetzle. Surface Reconstruction from Unorganized Points. Computer Graphics Proceedings, Annual Conference Series, August 1992, pages 295-302. Available as Department of Computer Science and Engineering Technical Report TR 91-12-03, University of Washington, 1991.
- On-line documents:
- Directory containing TR version
Comments to grail-webmaster@cs.washington.edu 14 June 2001 sns