Tristan Swedish Dissertation Defense

Dissertation Title: Computational Discovery of Hidden Cues in Photographs


Images of everyday scenes often contain hidden information that can be extracted to localize objects outside the view of the camera and to see around corners. For example, we show that it is possible to look at the shadows cast on a table by an object, such as a teapot, and reconstruct an image of the surrounding room. We describe how to identify and make use of these "hidden cues": shadows, reflections, and other subtle changes in an image caused by the interaction of light with objects in a scene that are not in the camera's direct line of sight. We use the term "computational discovery" to describe techniques that uncover these cues and reveal hidden information.

Despite incredible advances in computer vision in recent years, cameras are limited to a single viewpoint of a scene, so many perception tasks today require invasive multi-camera setups or active imaging modalities. Prior work has identified hidden cues that are present in photographs of certain environments, but these methods often require human insight to identify the cues and extensive calibration to make use of them. To address these limitations, we propose an end-to-end machine learning framework to identify hidden cues. More generally, we show that object localization is approximately equivalent to localizing a point light source, and describe how this insight can be used to identify situations in which object localization is possible. Furthermore, we show that physically-based "inverse rendering" can be used to estimate how light travels within a scene, turning objects such as coffee cups or picture frames into "object cameras". Physical models, however, are fragile to small errors in estimated scene parameters. We therefore suggest reconstruction methods that make use of the uncertainty in scene parameters to improve robustness.

The thesis suggests a number of other interesting ways hidden cues may be used in combination with imaging systems. This work could inspire future cameras that incorporate the environment itself as part of the imaging system, blurring the line between observer and subject.

Committee members:

Prof. Ramesh Raskar
Associate Professor, MIT Media Lab

Prof. Roarke Horstmeyer
Assistant Professor, Duke University

Prof. Ashok Veeraraghavan
Professor, Rice University
