TL;DR: Given a collection of images under extreme illumination variation, we make the lighting consistent and obtain a consistent NeRF!
Here we show our results on our synthetic dataset, with rendering trajectories of the reconstructions under the reference illumination. Please see the full set of results here.
Here we show our results on real scenes from the NAVI dataset.
Given a collection of photos taken under varying illumination, we select the image with the desired illumination as the reference, then use a multiview diffusion model to relight all the images to match the reference. We then use a reflection-aware NeRF to reconstruct the object from the relit images. Shading embeddings allow us to model per-image normal variations, which we explain below.
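For concreteness, here is a minimal Python sketch of the overall pipeline. The helper names (`relight_with_diffusion`, `train_reflection_aware_nerf`), the toy blending, and all shapes are our own illustrative assumptions, not the released code.

```python
import numpy as np

def relight_with_diffusion(images, reference):
    """Hypothetical stand-in for the multiview relighting diffusion model:
    here we simply blend each image toward the reference illumination."""
    return [0.5 * img + 0.5 * reference for img in images]

def train_reflection_aware_nerf(relit_images, shading_embed_dim=8):
    """Hypothetical stand-in for the reflection-aware NeRF stage.
    One learnable shading embedding per image absorbs the small
    per-image normal variations that remain after relighting."""
    shading_embeddings = [np.zeros(shading_embed_dim) for _ in relit_images]
    # ... optimize NeRF parameters and shading_embeddings jointly ...
    return shading_embeddings

# Toy capture: N images of the same object under different illuminations.
rng = np.random.default_rng(0)
captures = [rng.random((64, 64, 3)) for _ in range(4)]

reference = captures[0]                      # pick the desired illumination
relit = relight_with_diffusion(captures, reference)
embeddings = train_reflection_aware_nerf(relit)
```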
In NeRF, view-dependent effects such as reflections are modeled by feeding the viewing direction into the appearance MLP. However, it is challenging to fit reflections accurately this way. NeRF-Casting addresses this issue by explicitly modeling reflections off surfaces using normals predicted by an MLP. When the input images have inconsistencies, as our relit images do, the typical solution is to use appearance embeddings, but this results in diffuse objects with incorrectly "static" reflections, as we show below. Instead, we model the small remaining inconsistencies in our relit images as variations in the normal vectors, and we handle them by feeding an additional per-image learnable vector, which we call a shading embedding, into the normal-prediction MLP.
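Below is a small PyTorch sketch of what feeding a per-image shading embedding into the normal-prediction MLP could look like. The class name, layer sizes, and embedding dimension are assumptions for illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class NormalMLPWithShadingEmbedding(nn.Module):
    """Sketch: the normal-prediction MLP receives a per-image learnable
    "shading embedding" alongside the spatial feature, so small per-image
    inconsistencies can be explained as normal perturbations rather than
    being absorbed into a diffuse, view-independent appearance."""
    def __init__(self, num_images, feature_dim=64, embed_dim=8):
        super().__init__()
        self.shading_embeddings = nn.Embedding(num_images, embed_dim)
        self.mlp = nn.Sequential(
            nn.Linear(feature_dim + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, features, image_ids):
        emb = self.shading_embeddings(image_ids)        # (B, embed_dim)
        n = self.mlp(torch.cat([features, emb], dim=-1))
        return nn.functional.normalize(n, dim=-1)       # unit normals

# Usage: one embedding per input image, indexed by the image each ray came from.
model = NormalMLPWithShadingEmbedding(num_images=4)
feats = torch.randn(16, 64)
ids = torch.randint(0, 4, (16,))
normals = model(feats, ids)                             # (16, 3) unit vectors
```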
We show that using prior work's appearance embeddings results in a diffuse appearance. In contrast, our shading embeddings preserve reflections and specular highlights.
Since the input images have varying illumination, we can choose any of them as the reference and reconstruct the object under that reference illumination.
Compared to recent state-of-the-art generative relighting, our relighting method is much more consistent across different sampling outputs. While the baseline shows significant lighting variations, the variance in our outputs appears as small displacements of the specular highlights, further motivating our shading embeddings.
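As a rough illustration of what we mean by consistency across samples, one can measure per-pixel variance over repeated relighting samples of the same view; the snippet below is a generic sketch with placeholder data, not the evaluation protocol from the paper.

```python
import numpy as np

def sample_variance(samples):
    """Per-pixel variance across repeated relighting samples of one view.
    Low variance indicates consistent outputs; residual variance that
    concentrates around highlights corresponds to small highlight shifts."""
    stack = np.stack(samples, axis=0)           # (S, H, W, 3)
    return stack.var(axis=0).mean(axis=-1)      # (H, W) variance map

rng = np.random.default_rng(0)
samples = [rng.random((32, 32, 3)) for _ in range(5)]  # placeholder samples
var_map = sample_variance(samples)
print(var_map.mean())
```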
We would like to thank Matthew Burruss and Xiaoming Zhao for their help with the rendering pipeline. We also thank Ben Poole, Alex Trevithick, Stan Szymanowicz, Rundi Wu, David Charatan, Jiapeng Tang, Matthew Levine, Ruiqi Gao, Ricardo Martin-Brualla, and Aleksander Hołyński for fruitful discussions.
@misc{alzayer2024generativemvr,
title={Generative Multiview Relighting for {3D} Reconstruction under Extreme Illumination Variation},
author={Alzayer, Hadi and Henzler, Philipp and Barron, Jonathan T. and Huang, Jia-Bin and Srinivasan, Pratul P. and Verbin, Dor},
year={2024},
eprint={2412.15211},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.15211},
}