If you have an Adobe Acrobat reader or plug-in, click here to read the paper.
To acquire a Postscript version of the paper by ftp click here. It appears in Proceedings of SIGGRAPH 1995.
In a preprocessing step, we perform a wavelet transform on every image in the database. By collecting just the few largest coefficients from this transform, we distill a small ``signature'' for each of these images. We save these signatures, organized in such a way that it is easy (and fast) to compare them all to a new signature.
Then, when the user submits a query, we perform the wavelet transform to produce a signature for the query image. This new signature is compared to the signatures of the database images, and the best matches are retrieved for the user.
|20 coeffs||100 coeffs||400 coeffs||Original (16,000 coeffs)|
The wavelet transform is a tool that has emerged from the mathematics community in last decade or so for analyzing functions at different levels of detail. It is similar to the Fourier transform, but encodes both frequency and spacial information.
Much of the work in wavelet research relating to images has been in the area of image compression. By saving the few largest wavelet coefficients for an image (and throwing away all of the smaller coefficients), it is possible to recover a fairly accurate representation of the image. This property may be exploited for compression. For example, the image shown above that incorporates 400 coefficients would require about 3% as much disk space as the original image.
In our application, we take the wavelet transform and keep just a few (20) coefficients for each color channel and distill from them a very small ``signature'' for each image. Because the signature is so small, it permits very fast searching in the database. Nonetheless, it has proven to be an effective method for discriminating images.
Our application allows the user to paint a query image in a Macpaint- or Photoshop-like way. Alternately, the user may load a query image from disk, perhaps one that was scanned or captured by a video camera. When the user presses a ``match'' button, the application retrieves the 20 most similar images (by our wavelet-based metric) from the database. These are displayed as small thumbnails in the upper right part of the window. If one of these was the desired image, the user simply clicks on that thumbnail to retrieve the full-sized image. Otherwise, if the desired image doesn't appear among the thumbnails, the user may modify the query and ``match'' again.
Because the queries may be matched against the database very quickly, we have implemented an ``interactive'' mode for the application. In this mode, the application re-evaluates the painted query in the database as each stroke is drawn. Every time the user pauses from painting for a moment, the application retrieves a new set of thumbnails. Median time to retrieve the target image appears to be around twenty seconds for a database of 1000 images.
Or, if you have an Adobe Acrobat reader or plug-in, click here to read the paper.
With this larger database, the user may search for images using the same painting application. But when the user clicks on a thumbnail, the full-size image is retrieved over the Web. We have yet to perform tests to see how effective our method is at retrieving specific images from this larger database.