In this work, we want to generate a 3D scene from data. The neural renderer will use deep learning techniques.
Context encoder: [30] trained an encoder-decoder model to fill in a central square hole in an image, using a combination of an L2 regression loss on pixel values and an adversarial loss.
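A minimal sketch of such a combined objective, assuming the generator output is compared to the ground truth with L2 while a discriminator scores the fill-in (the weighting constants here are illustrative, not taken from [30]):

```python
import numpy as np

def inpainting_loss(pred, target, disc_fake_prob,
                    lambda_rec=0.999, lambda_adv=0.001):
    """Combined inpainting objective: L2 pixel regression plus an
    adversarial term from the discriminator's score on the fake fill-in.

    pred, target: (H, W, 3) arrays for the hole region.
    disc_fake_prob: discriminator probabilities that the fill-in is real.
    """
    rec = np.mean((pred - target) ** 2)            # L2 regression on pixels
    adv = -np.mean(np.log(disc_fake_prob + 1e-8))  # generator's adversarial term
    return lambda_rec * rec + lambda_adv * adv
```

In practice the reconstruction term dominates (hence the skewed weights), while the adversarial term pushes the fill-in toward the natural-image manifold rather than a blurry mean.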
The idea of directly re-using the pixels from available images to generate new views has been popular in computer graphics. While these methods yield high-quality novel views, they do so by compositing the corresponding input image rays for each output pixel and can therefore only generate already-seen content (e.g., they cannot create the rear view of a car from available frontal and side-view images).
MPI paper: multiplane-image representations render novel views by alpha-compositing a stack of fronto-parallel RGBA planes.
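A minimal sketch of the back-to-front "over" compositing step that MPI-style renderers use (layer ordering and shapes are assumptions for illustration):

```python
import numpy as np

def composite_mpi(rgba_layers):
    """Back-to-front 'over' compositing of multiplane-image (MPI) layers.

    rgba_layers: list of (H, W, 4) arrays ordered from back (far) to
    front (near); channel 3 is the per-pixel alpha in [0, 1].
    """
    h, w, _ = rgba_layers[0].shape
    out = np.zeros((h, w, 3))
    for layer in rgba_layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # Each nearer layer occludes what is behind it by its alpha.
        out = alpha * rgb + (1.0 - alpha) * out
    return out
```

The same compositing runs for every target viewpoint after the planes are reprojected via their homographies, which is what makes MPI rendering cheap at view time.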
Why is this better than image-based disocclusion?
Generative image inpainting with contextual attention (Yu et al., 2018).