Paper Information
ECCV 2020 oral
- title : NeRF : Representing Scenes as Neural Radiance Fields for View Synthesis
- journal : Communications of the ACM
- Author : Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik et al.

https://www.matthewtancik.com/nerf

Abstract
Main method : optimizes an underlying continuous volumetric scene function using a sparse set of input views.
Purpose? : synthesizes novel views of complex scenes
novelty :
1) effectively optimizes neural radiance fields to render photorealistic novel views of scenes
2) outperforms prior work on neural rendering and view synthesis
How? :
1) optimizes the weights of an MLP (a fully-connected deep network)
input - 5D coordinates (x, y, z, θ, ϕ)
output - volume density & view-dependent emitted radiance
2) synthesizes views by querying 5D coordinates along camera rays & using classic volume rendering to composite the outputs into an image
1. Introduction
In this work, we address view synthesis in a new way by directly optimizing parameters of a continuous 5D scene representation to minimize the error of rendering a set of captured images.

main ideas

(a) -> (b) : produce color and density using MLP

(c) -> (d) : composite outputs into an image
Summary
We suggest three main ideas
1. an approach for representing continuous scenes with complex geometry and materials as 5D neural radiance fields
2. a differentiable rendering procedure based on classical volume rendering techniques
3. positional encoding => lets the optimized neural radiance field represent high-frequency scene content.
2. Related work
past : scenes were represented with discrete representations (triangle meshes, voxel grids, and so on)
=> not differentiable = hard to reproduce realistic scenes
2.1. Neural 3D shape representations
3D shapes : xyz coordinates -> signed distance function / occupancy field
[limit] requires access to ground-truth 3D geometry
[solution] 3D occupancy field + implicit differentiation (can be trained from 2D images only)
=> However, this is limited to simple shapes and results in oversmoothed renderings
Therefore, we suggest a new method that uses 5D radiance fields
2.2. View Synthesis and image-based rendering
novel view synthesis with sparse view sampling : significant progress
1) mesh-based representations of scenes with diffuse & view-dependent appearance
=> differentiable rasterizer (mesh optimized by gradient descent)
[limit] optimization is often difficult, and the template mesh required as initialization before optimization is usually unavailable for real-world scenes
2) volumetric representations
=> good for complex shapes
[limit] poor time and space complexity caused by discrete sampling prevents scaling to higher resolution
Therefore, we use a volumetric representation, but with a continuous volume instead of a discrete one.
3. Neural Radiance Field Scene Representation
input : 5D coordinate (x, y, z, θ, ϕ)
x, y, z : 3D location
θ, ϕ : 2D viewing direction
output : (c, σ)
c : (r, g, b)
σ : volume density
MLP F_θ : (x, d) -> (c, σ)
d : 3D Cartesian unit vector (the viewing direction θ, ϕ)
input => MLP F_θ => output
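The two ways of writing the viewing direction, (θ, ϕ) and the unit vector d, carry the same information. A minimal sketch of the conversion follows; the function name `direction_from_angles` and the angle convention (θ measured from the +z axis, ϕ measured from the +x axis in the xy-plane) are assumptions for illustration, not something fixed by the paper.

```python
import numpy as np

def direction_from_angles(theta, phi):
    """Convert viewing angles (radians) into a 3D Cartesian unit vector d.

    Assumed convention: theta is the polar angle from +z,
    phi is the azimuth from +x in the xy-plane.
    """
    return np.array([
        np.sin(theta) * np.cos(phi),
        np.sin(theta) * np.sin(phi),
        np.cos(theta),
    ])

# Looking along the +x axis under the assumed convention.
d = direction_from_angles(np.pi / 2, 0.0)
```

The result always has unit length, so it can be passed to F_θ as d together with the 3D location x.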

4. Volume Rendering with Radiance Fields
σ(x) : volume density (differential probability of a ray terminating at an infinitesimal particle at location x)
C(r) : expected color
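For reference, the expected color of a camera ray r(t) = o + t d with near and far bounds t_n and t_f, as defined in the paper, can be written in LaTeX as below. T(t) is the accumulated transmittance, i.e. the probability that the ray travels from t_n to t without hitting any particle.

```latex
C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt,
\qquad
T(t) = \exp\!\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right)
```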

=> C(r) is an integral along the ray, so it is estimated numerically (quadrature with stratified sampling)

+) with a discrete set of samples along the ray, C(r) is estimated by a quadrature rule (a weighted sum of the sample colors)
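A minimal NumPy sketch of this discrete estimate, under stated assumptions: the function names `stratified_samples` and `composite` are made up for illustration, and the last interval is closed with the far bound `t_far`, which is a choice of this sketch. Each sample gets an opacity α_i = 1 − exp(−σ_i δ_i), where δ_i is the distance to the next sample, and a transmittance T_i = exp(−Σ_{j<i} σ_j δ_j); the estimated color is Σ_i T_i α_i c_i.

```python
import numpy as np

def stratified_samples(t_near, t_far, n_samples, rng):
    """Draw one uniform sample from each of n_samples evenly spaced bins."""
    edges = np.linspace(t_near, t_far, n_samples + 1)
    return edges[:-1] + rng.uniform(size=n_samples) * (edges[1:] - edges[:-1])

def composite(sigmas, colors, t_vals, t_far):
    """Quadrature estimate of C(r) = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    deltas = np.diff(t_vals, append=t_far)       # delta_i = t_{i+1} - t_i (last one ends at t_far)
    alphas = 1.0 - np.exp(-sigmas * deltas)      # opacity of each segment
    # T_i = exp(-sum_{j<i} sigma_j * delta_j): light surviving up to sample i
    optical_depth = np.concatenate([[0.0], np.cumsum(sigmas * deltas)[:-1]])
    weights = np.exp(-optical_depth) * alphas
    return weights @ colors, weights

# Toy scene (hypothetical): empty space, then dense red material beyond t = 4.
rng = np.random.default_rng(0)
t_vals = stratified_samples(2.0, 6.0, 64, rng)
sigmas = np.where(t_vals > 4.0, 50.0, 0.0)
colors = np.tile([1.0, 0.0, 0.0], (64, 1))
rgb, weights = composite(sigmas, colors, t_vals, 6.0)
```

The returned `weights` are the per-sample contributions T_i α_i; they are what the hierarchical sampling step later reuses.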

5. Optimizing a Neural Radiance Field

5.1. Positional encoding
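deep networks : biased towards learning low-frequency functions
=> Rahaman et al. : mapping inputs into a higher-dimensional space before passing them to the network helps fit high-frequency variation
We leverage this finding and reformulate F_θ as a composition of two functions, F_θ = F'_θ ∘ γ (γ is fixed, F'_θ is a regular MLP that is learned).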

γ : mapping from R into a higher-dimensional space R^{2L}

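-> applied separately to each of the values in x and in d (in this experiment, we set L=10 for γ(x) and L=4 for γ(d))
+) It is similar to the positional encoding in the Transformer, but has a different goal (here : mapping continuous coordinates to a higher dimension so the MLP can fit high-frequency functions, not encoding token order).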
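In the paper, γ(p) = (sin(2^0 πp), cos(2^0 πp), ..., sin(2^{L-1} πp), cos(2^{L-1} πp)). A minimal NumPy sketch of that mapping follows; the function name `positional_encoding` and the output ordering (all frequencies of the first component, then the second, and so on) are assumptions of this sketch.

```python
import numpy as np

def positional_encoding(p, L):
    """Apply gamma to every component of p: shape (..., D) -> (..., 2 * L * D)."""
    p = np.asarray(p, dtype=np.float64)
    freqs = (2.0 ** np.arange(L)) * np.pi                       # 2^0 * pi, ..., 2^(L-1) * pi
    angles = p[..., None] * freqs                               # (..., D, L)
    enc = np.stack([np.sin(angles), np.cos(angles)], axis=-1)   # (..., D, L, 2)
    return enc.reshape(*p.shape[:-1], -1)

x = np.array([0.1, -0.4, 0.7])   # 3D location, assumed already normalized to [-1, 1]
d = np.array([0.0, 0.0, 1.0])    # viewing direction as a unit vector
gamma_x = positional_encoding(x, L=10)   # 3 * 2 * 10 = 60 values
gamma_d = positional_encoding(d, L=4)    # 3 * 2 * 4  = 24 values
```

With L=10 and L=4, the MLP therefore sees 60 values for the location and 24 for the direction instead of 3 each.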
5.2. Hierarchical volume sampling
To increase rendering efficiency, we propose a hierarchical representation
=> optimizing two networks simultaneously (coarse, fine)
1. coarse network : using N_c samples

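C_c(r) : weighted sum of all sampled colors, with weights w_i = T_i (1 - exp(-σ_i δ_i)) (T_i : accumulated transmittance, δ_i : distance between adjacent samples)
2. fine network : using N_c + N_f samples (the N_f extra samples are drawn where the normalized coarse weights are large)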
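The paper draws the N_f extra samples by inverse transform sampling from the piecewise-constant PDF given by the normalized coarse weights. A minimal NumPy sketch of that step is below; the function name `sample_pdf`, the `eps` smoothing term, and the assumption that bin edges are passed in explicitly belong to this sketch, and the bookkeeping may differ from the authors' code.

```python
import numpy as np

def sample_pdf(bin_edges, weights, n_fine, rng, eps=1e-5):
    """Draw n_fine depths from the piecewise-constant PDF defined by weights.

    bin_edges has length N + 1 and weights has length N (one weight per bin).
    """
    pdf = (weights + eps) / np.sum(weights + eps)      # normalized weights
    cdf = np.concatenate([[0.0], np.cumsum(pdf)])      # CDF evaluated at the bin edges
    u = rng.uniform(size=n_fine)                       # uniform samples in [0, 1)
    idx = np.searchsorted(cdf, u, side="right") - 1    # bin that contains each u
    idx = np.clip(idx, 0, len(weights) - 1)
    frac = np.clip((u - cdf[idx]) / pdf[idx], 0.0, 1.0)  # relative position inside the bin
    return bin_edges[idx] + frac * (bin_edges[idx + 1] - bin_edges[idx])

# Hypothetical coarse result: 8 bins, almost all weight in bin 3 (depth 3.5 to 4.0).
rng = np.random.default_rng(0)
bin_edges = np.linspace(2.0, 6.0, 9)
weights = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
t_fine = sample_pdf(bin_edges, weights, 128, rng)
```

The fine network is then evaluated on the sorted union of the coarse samples and `t_fine`, which gives the N_c + N_f samples mentioned above.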
5.3. Implementation details
loss function
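In the paper, the loss is the total squared error between the rendered and true pixel colors, summed over a batch of rays and over both the coarse and the fine renderings. A minimal NumPy sketch is below; the name `nerf_loss` and the (number of rays, 3) array layout are assumptions for illustration.

```python
import numpy as np

def nerf_loss(rgb_coarse, rgb_fine, rgb_true):
    """Sum over rays of ||C_c(r) - C(r)||^2 + ||C_f(r) - C(r)||^2.

    All inputs are arrays of shape (num_rays, 3).
    """
    coarse_term = np.sum((rgb_coarse - rgb_true) ** 2, axis=-1)
    fine_term = np.sum((rgb_fine - rgb_true) ** 2, axis=-1)
    return np.sum(coarse_term + fine_term)

# Two rays whose true color is black; the coarse render is white, the fine render is exact.
rgb_true = np.zeros((2, 3))
loss = nerf_loss(np.ones((2, 3)), rgb_true, rgb_true)
```

The coarse term is kept in the loss even though the final image comes from the fine network, because the coarse weights decide where the fine samples are placed.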

6. Results
Quantitative and qualitative comparisons show that this method outperforms prior work





7. Conclusion
This work addresses deficiencies of prior work using
(1) MLPs to represent objects and scenes as continuous functions
(2) 5D neural radiance fields
(3) hierarchical sampling strategy
Future work
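sampled representations (such as voxel grids and meshes) admit reasoning about the expected quality of rendered views and failure modes, but it is unclear how to analyze these issues when the scene is encoded in the weights of a deep network => interpretability is a direction for future work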
=> believe that this work makes progress towards a graphics pipeline based on real world imagery, where complex scenes could be composed of neural radiance fields optimized from images of actual objects and scenes.