Casual Scene Capture and Editing for AR/VR Applications

Pulkit Gera

Abstract

Augmented Reality and Virtual Reality (AR/VR) applications can become far more widespread if they can photo-realistically capture our surroundings and modify them in different ways. Such modifications could include editing the scene’s lighting, changing the objects’ materials, or augmenting the scene with virtual objects. There has been a significant amount of work in this domain. However, most of these works capture data in controlled settings that require expensive setups such as light stages. These methods are impractical and cannot scale. Thus, we must design solutions that capture scenes casually using off-the-shelf devices commonly available to the public. Further, the user should be able to interact with the captured scenes and modify them, for example by editing materials or augmenting new objects into the scene.

In this thesis, we study how to produce novel views of a casually captured scene and modify them in interesting ways. First, we present a neural rendering framework for simultaneous novel view synthesis and appearance editing of a scene casually captured with off-the-shelf smartphone cameras under known illumination. Existing approaches cannot both perform novel view synthesis and edit the materials of the scene objects. We propose a method that explicitly disentangles appearance from lighting while estimating radiance and learns an independent estimate of the scene’s lighting. This allows us to generalize to arbitrary changes in the scene’s materials while performing novel view synthesis. We demonstrate our results on synthetic and real scenes.

Next, we present PanoHDR-NeRF, a neural representation of an indoor scene’s high dynamic range (HDR) radiance field that can be captured casually without elaborate setups or complex capture protocols. First, a user captures a low dynamic range (LDR) omnidirectional video of the scene by freely waving an off-the-shelf camera around the scene. Then, an LDR2HDR network converts the captured LDR frames to HDR frames, which are subsequently used to train a modified NeRF++ model. The resulting PanoHDR-NeRF representation can synthesize full HDR panoramas from any location in the scene. We also show that the HDR images produced by PanoHDR-NeRF can synthesize correct lighting effects, enabling the augmentation of indoor scenes with synthetic objects that are lit correctly. Through these works, we demonstrate how scenes can be captured casually for AR/VR applications and then further edited by the user.

Year of completion:	August 2022
Advisors:	P J Narayanan, Jean-François Lalonde