Designing Physically-Motivated CNNs for Shape and Material


Abstract:

Image formation is a physical phenomenon that often involves complex factors such as shape deformations, occlusions, material properties, and participating media. Consequently, practical deployment of intelligent vision-based systems requires robustness to the effects of these diverse factors. Such effects may be inverted by modeling the image formation process, but hand-crafted features and hard-coded rules face limitations on data inconsistent with the model. Recent advances in deep learning have led to impressive performance, but generalizing a purely data-driven approach to handle such complex effects is expensive. Thus, an avenue for successfully handling the diversity of real-world images is the incorporation of physical models of image formation within deep learning frameworks.

This talk presents some of our recent work on the design of convolutional neural networks (CNNs) that incorporate physical insights from image formation to learn 3D shape, semantics, and material properties, with benefits such as higher accuracy, better generalization, and greater ease of training. To handle complex materials, we propose novel 4D CNN architectures for material recognition from a single light field image and model bidirectional reflectance distribution functions (BRDFs) to acquire spatially-varying material properties from a single mobile phone image. To handle participating media, we propose a generative adversarial network (GAN) that inverts the effects of complex distortions induced by a turbulent refractive interface. To recover semantic 3D shape despite occlusions, we model the rendering process as deep supervision for intermediate layers in a CNN, which effectively bridges domain gaps to yield superior performance on real images despite training purely on simulations. Finally, we demonstrate state-of-the-art face recognition from profile views through an adversarial learning framework that frontalizes input faces using 3D morphable models.


Bio:

Manmohan Chandraker is an assistant professor in the CSE department at the University of California, San Diego, and heads computer vision research at NEC Labs America. He received his PhD from UCSD and was a postdoctoral scholar at UC Berkeley. His research interests are 3D scene understanding and reconstruction, with applications to autonomous driving and human-computer interfaces. His work has received the Marr Prize Honorable Mention for Best Paper at ICCV 2007, the 2009 CSE Dissertation Award for Best Thesis at UCSD, inclusion in the PAMI special issue on the best papers of CVPR 2011, and the Best Paper Award at CVPR 2014.