NeRF Regularization Framework
variants incorporate constraints to enforce smoothness between neighboring samples in space, such as RegNeRF [26] (see Section 3). InfoNeRF [18] prevents inconsistencies due to insufficient viewpoints by minimizing a ray entropy model and the KL-divergence between the normalized ray densities obtained from neighboring viewpoints. In contrast, externally supervised regularization methods usually penalize differences with respect to extrinsic geometric cues. Depth-supervised NeRF [11] encourages the rendered depth (4) to be consistent with a sparse set of 3D surface points obtained by structure from motion. A similar strategy is used in [20], based on a set of 3D points refined by bundle adjustment, or in [32], where a sparse point cloud is converted into dense depth priors by means of a depth completion network.

3. A generic regularization framework

One of the major challenges when training a NeRF with insufficient data is to learn a consistent scene geometry so that the model extrapolates well to unseen views. In that case, it is common to add additional priors to the model to improve the quality of the learned models.

A classic hypothesis in depth and disparity estimation is that the target is smooth [15, 35]. The same prior can be applied to the scene modeled by the NeRF. Because NeRFs can model transparent surfaces and volumes, the predicted weights can be highly irregular. As a consequence, it is easier to regularize across different rendered viewpoints (i.e. after projection onto a given camera) than to regularize the 3D scene directly. This means that instead of using the depth function d from Eq. (4), it is more appropriate to work with the depth map d̃ produced by the NeRF model from a given viewpoint. This depth map d̃ is then indexed by its 2D coordinates (x, y) instead of by a ray in 3D space.

In image processing, a classic way of enforcing smoothness is to add a regularization term to the loss function based on the gradients of the image. For example, penalizing the squared L2 norm of the gradients removes high gradients in the depth map, thus enforcing it to be smooth, as desired. In addition, it does not penalize slanted surfaces (since they have null Laplacian), as would happen with a total variation regularization [33]. The proposed regularization term thus reads

    L_depth = Σ_{(x,y)} clip(‖∇d̃(x, y)‖², g_max).    (5)

In practice, we add a differentiable clipping to L_depth, parametrized by g_max, to preserve sharp edges that could otherwise be over-smoothed.
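The exact form of the clipping is a design choice. As a minimal sketch, a tanh-based soft clip (the specific form and the value of g_max are illustrative assumptions, not necessarily those used in our experiments) saturates large squared gradients at g_max while leaving small ones nearly untouched:

```python
import torch

def soft_clip(x, g_max):
    # One possible differentiable clipping: close to the identity for values well
    # below g_max, smoothly saturating at g_max for large values.
    return g_max * torch.tanh(x / g_max)

grad_sq = torch.tensor([0.01, 0.5, 10.0])   # toy squared depth-gradient magnitudes
print(soft_clip(grad_sq, g_max=1.0))        # the large value is capped near 1.0
```

Any smooth, monotone function that behaves like the identity below g_max and is bounded above would play the same role.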
By simply changing the ReLU activation function to a Softplus activation, the MLP used in NeRF becomes a continuous and infinitely differentiable function, similarly to [14]. This allows training directly with the gradient of the model, or even with higher order operators, as shown later.
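As a minimal sketch (the layer sizes and the Softplus sharpness β are illustrative, not those of any particular implementation), the activation swap is a one-line change in the density MLP, after which autograd can differentiate the model as many times as needed:

```python
import torch
import torch.nn as nn

class SmoothDensityMLP(nn.Module):
    # Toy density MLP; Softplus replaces ReLU so the network is C-infinity and
    # autograd gradients / higher-order derivatives are well defined.
    def __init__(self, in_dim=3, hidden=64, beta=100.0):
        super().__init__()
        act = nn.Softplus(beta=beta)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), act,
            nn.Linear(hidden, hidden), act,
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        return self.net(x)

x = torch.randn(8, 3, requires_grad=True)
sigma = SmoothDensityMLP()(x)
grad = torch.autograd.grad(sigma.sum(), x, create_graph=True)[0]  # usable in a loss
```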
Traditionally, NeRFs are defined in terms of rays, which are characterized by an origin and a viewing direction (o, v). Consequently, d from (4) is parameterized by (o, v) instead of by the image coordinates (x, y), as d̃ is in (5). Let C : R² → R³ be the transformation that converts the image coordinates into the equivalent ray, so that d̃(x, y) = d(o, C(x, y)). Then the corresponding gradients are

    ∇_{(x,y)} d̃(x, y) = J_C(x, y) ∇_v d(o, v),    (6)

where v = C(x, y) and J_C is the Jacobian matrix of C. This way, Eq. (5) can be expressed in terms of rays, with the exception of J_C, which could be computed at the same time as the corresponding rays during the dataloading process. In practice, we use a simplified regularization loss that avoids computing J_C (see Eq. (11)).
Link with RegNeRF. In order to improve the robustness of NeRFs when training with few data, Niemeyer et al. [26] proposed RegNeRF, which also uses an additional term in the loss function to regularize the predicted depth map. This work additionally proposed an appearance regularization term using a normalizing flow network trained to estimate the likelihood of a predicted patch compared to normal patches from the JFT-300M dataset [41]. While the latter is not studied here, we show that their depth regularization term is simply an approximation of the more generic differential loss presented in Eq. (5).

Consider the depth map d̃ and the set of coordinates (x, y) that correspond to the pixels of the depth map. RegNeRF regularizes depth by encouraging neighboring pixels (x + i, y + j), for (i, j) ∈ {0, 1}² with i + j = 1, to have the same depth as the pixel (x, y):

    L_depth = Σ_{(x,y)} Σ_{(i,j)∈{0,1}², i+j=1} ( d̃(x + i, y + j) − d̃(x, y) )²,    (7)

which is a finite difference expression of the gradient of d̃. Thus the major difference between (7) and our approach is that (7) approximates the gradient with finite differences, while we take advantage of automatic differentiation.
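For reference, the finite-difference loss (7) on a rendered depth patch can be sketched as follows; depth_patch is a hypothetical (H, W) tensor of rendered depths, and the helper name is ours (this illustrates Eq. (7), it is not the RegNeRF code):

```python
import torch

def regnerf_depth_smoothness(depth_patch):
    # depth_patch: (H, W) rendered depths of a patch.
    # Eq. (7): squared difference between each pixel and its right / bottom neighbour.
    dx = depth_patch[:, 1:] - depth_patch[:, :-1]
    dy = depth_patch[1:, :] - depth_patch[:-1, :]
    return (dx ** 2).sum() + (dy ** 2).sum()

loss = regnerf_depth_smoothness(torch.rand(8, 8))   # toy 8x8 patch
```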
In practice, RegNeRF regularization is not done on the entire depth maps but rather by sampling patches. The loss (7) is computed not only for all patches corresponding to a view in the training dataset, but also for rendered patches whose observation is not available. Indeed, all views should verify this depth regularity property, not only those in the training data. As a result, RegNeRF requires modifying the dataloaders to incorporate patch-based sampling and rays corresponding to unseen views. Note that our depth regularization term (5) does not require patches and can therefore be applied directly with single-ray sampling, as traditionally done to train NeRFs. This makes the proposed framework compatible with previous single-ray regularization methods, such as InfoNeRF [18], and with non-uniform ray sampling, important when working with 360° images [27]. It also does not regularize unseen views, as explained in Section 4.
Normals regularization. The regularization term (5) relies on depth maps. However, differential geometry also allows us to regularize other geometry-related features when training a NeRF. For example, consider n, the function that returns the scene normals for a given ray, whose projection, or map of normals, is denoted ñ. In that case, the regularization of the normals of the scene becomes

    L_normals = Σ_{(x,y)} ‖J_ñ(x, y)‖²_F,    (8)

where J_ñ is the Jacobian of the map of normals. This regularizer was applied to generate one of the results in Fig. 1.
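A minimal sketch of Eq. (8), assuming PyTorch ≥ 2.0 (torch.func) and using a toy stand-in for the rendered normal map ñ (in DiffNeRF the normals would come from volume rendering, not from the closed-form function below), reads:

```python
import torch
from torch.func import jacrev, vmap

def normal_map(xy):
    # Placeholder for the rendered normal map n~(x, y): R^2 -> R^3.
    # Here it is a toy smooth function so the sketch is self-contained.
    x, y = xy[0], xy[1]
    n = torch.stack([torch.sin(x), torch.cos(y), torch.ones_like(x)])
    return n / n.norm()

pixels = torch.rand(128, 2)                     # sampled (x, y) coordinates
J = vmap(jacrev(normal_map))(pixels)            # per-pixel 3x2 Jacobians of n~
loss_normals = (J ** 2).sum(dim=(1, 2)).mean()  # squared Frobenius norm, Eq. (8)
```

The mean over sampled pixels is a scaled version of the sum in Eq. (8).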
Simplified regularization loss. The main problem with the loss presented in Eq. (5) is that it does not depend only on each individual ray, but also requires additional camera information to compute J_C. Since this can be impractical depending on the camera model, we propose to use a different, fixed local camera model only for the regularization process. Instead of using the usual perspective projection models associated with the training data, it is possible to regularize the scene as if the ray being processed originated from an orthographic projection camera, as illustrated in Fig. 2.

Figure 2. All perspective projection rays originate at the same center of projection o, located at a finite distance from the image plane. The center of projection in orthographic projection is at infinity, which can be represented by using a different origin for each ray, so that the origin points are parallel to the image plane.
Consider a ray defined by its origin o and its direction v. Let (i, j) be a local orthonormal basis of the plane defined by o and v. Using an orthographic projection camera, the direction is fixed and only the origin changes to obtain the other rays from the same camera. Therefore C, defined such that d̃(x, y) = d(C(x, y), v), is explicit and C(x, y) = xi + yj. This leads to J_C(x, y) = (iᵀ, jᵀ)ᵀ ∈ R^{2×3}, the matrix whose rows are iᵀ and jᵀ. Therefore

    ∇_{(x,y)} d̃(x, y) = ( ⟨∇_o d(o, v), i⟩ , ⟨∇_o d(o, v), j⟩ )ᵀ    (9)

and ‖∇d̃(x, y)‖² = ⟨∇_o d(o, v), i⟩² + ⟨∇_o d(o, v), j⟩². Since (i, j, v) is, by construction, an orthonormal basis of the space, we also have ‖∇_o d(o, v)‖² = ⟨∇_o d(o, v), i⟩² + ⟨∇_o d(o, v), j⟩² + ⟨∇_o d(o, v), v⟩², thus

    L_depth = Σ_{(o,v)∈R} ( ‖∇_o d(o, v)‖² − ⟨∇_o d(o, v), v⟩² )    (10)
            = Σ_{(o,v)∈R} ‖∇_o d(o, v) − ⟨∇_o d(o, v), v⟩ v‖².    (11)

Note how Eq. (11) does not depend on the choice of (i, j), is entirely defined by the knowledge of the ray (o, v), and is independent of J_C(x, y).
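A minimal sketch of Eq. (11) with automatic differentiation could look as follows; render_depth stands in for the (differentiable) NeRF depth rendering d(o, v), and the toy plane-intersection depth, the soft clip and the value of g_max are illustrative assumptions:

```python
import torch

def diff_depth_loss(render_depth, origins, dirs, g_max=1.0):
    """Eq. (11): || grad_o d(o,v) - <grad_o d(o,v), v> v ||^2 per ray.

    render_depth(origins, dirs) -> (N,) rendered depths; it stands in for the
    NeRF depth d(o, v) and must be differentiable w.r.t. the ray origins.
    """
    origins = origins.requires_grad_(True)
    d = render_depth(origins, dirs)                                       # (N,)
    grad_o = torch.autograd.grad(d.sum(), origins, create_graph=True)[0]  # (N, 3)
    radial = (grad_o * dirs).sum(-1, keepdim=True) * dirs                 # <grad_o d, v> v
    g2 = ((grad_o - radial) ** 2).sum(-1)                                 # squared norm, Eq. (11)
    return (g_max * torch.tanh(g2 / g_max)).mean()                        # soft clip as in Eq. (5)

# Toy stand-in for the renderer: depth of intersection with the plane z = 2.
toy_depth = lambda o, v: (2.0 - o[:, 2]) / v[:, 2]
o = torch.zeros(4, 3)
v = torch.nn.functional.normalize(torch.rand(4, 3) + 0.1, dim=-1)
loss = diff_depth_loss(toy_depth, o, v)
```

Since the loss only needs (o, v) and the rendered depth, it plugs into a standard single-ray training loop without any change to the dataloader.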
4. Experimental results

We test the impact of the proposed differential regularization on the task of scene estimation using only three input views. This is an extreme test case and, as such, it is highly reliant on the quality of the regularization to avoid catastrophic collapse, as shown by Niemeyer et al. [26] for mip-NeRF [2]. In order to compare the proposed formalization of RegNeRF [26] to its original version, we modified the code of the authors and replaced their depth loss by the one in (11). We refer to our approach as DiffNeRF. The code to reproduce the results presented in this section is available at https://github.com/tehret/diffnerf.

Results on LLFF [23]. In Table 1, we compare the results of the original RegNeRF (using the models trained by the authors) with our DiffNeRF formalization (11). Since the code released by the authors does not contain the additional appearance loss, we added another comparison that corresponds to RegNeRF without the additional appearance regularization (i.e. training from scratch using the available code). The proposed DiffNeRF not only improves the PSNR of reconstructed unseen views by 1 dB compared to the equivalent RegNeRF version, it also outperforms RegNeRF with appearance regularization by almost 0.5 dB. This is also the case for other metrics such as SSIM and LPIPS.

Visual results on two examples of the LLFF dataset are shown in Fig. 3. In both cases, we compare the proposed version with the models trained by Niemeyer et al. [26]. The horns scene in Fig. 3 shows a first example where our formalization outperforms RegNeRF across all evaluation metrics. The proposed method is able to learn a better geometry of the image, leading to a more complete reconstruction of the triceratops skull (see the horn on the right), but also of the rest of the scene, such as the sign panel in the foreground or the handrails in the background. Similar improvements can be observed in the trex scene.

Fig. 1 shows another result, with the room scene of the LLFF dataset, where the PSNR obtained with DiffNeRF is worse with respect to RegNeRF with appearance regularization. However, the depth map estimated by our formalism
                           fern   flower  fortress  horns  leaves  orchids  room   trex   avg.
PSNR
  PixelNeRF ft [47]          -      -        -        -      -       -       -      -    16.17
  SRF ft [8]                 -      -        -        -      -       -       -      -    17.07
SSIM
  DiffNeRF (ours)          0.703  0.707    0.761    0.680  0.645   0.487   0.864  0.791  0.705
  RegNeRF [26]             0.697  0.688    0.743    0.610  0.613   0.502   0.861  0.766  0.685
  RegNeRF (w/o app. reg.)  0.323  0.243    0.294    0.341  0.229   0.259   0.204  0.197  0.261
LPIPS
  DiffNeRF (ours)          0.290  0.223    0.219    0.293  0.186   0.247   0.171  0.166  0.224
  RegNeRF [26]             0.304  0.234    0.258    0.356  0.222   0.251   0.185  0.197  0.251
Table 1. Quantitative comparison of novel view synthesis for different NeRF regularizations on the LLFF dataset. All models were trained using only three input views. RegNeRF (w/o app. reg.) corresponds to the original RegNeRF without appearance regularization, while the proposed framework is DiffNeRF. The results using RegNeRF with appearance regularization are also provided for reference. The proposed regularization almost systematically achieves the best results across all metrics without requiring any additional appearance regularization. The LPIPS metric is computed using the official implementation provided by Zhang et al. [48]. Best results are shown in bold.
Table 2. Quantitative comparison of novel view synthesis for different NeRF regularizations on the DTU dataset. All models were trained using only three input views. RegNeRF (w/o app. reg.) corresponds to the original RegNeRF without appearance regularization, while the proposed framework is DiffNeRF. The results using RegNeRF with appearance regularization are also provided for reference. The case of scenes 41 and 82 is discussed in Section 4. Best results are shown in bold.
Figure 3. Visual examples of novel view synthesis for the horns (top) and trex (bottom) sequences of the LLFF dataset after training with three views (columns, left to right: RegNeRF [26], RegNeRF (w/o app. reg.), DiffNeRF (ours), ground truth). The depth maps produced by the proposed DiffNeRF are more regular than those produced by RegNeRF. DiffNeRF also recovers more details, both in the foreground (see the sign panel on the left or the triceratops' left horn) and in the background (see the glass panels and the handrails).
Figure 4. Visual examples of novel view synthesis for scenes 30 and 40 of the DTU dataset after training with three views (columns, left to right: RegNeRF [26], RegNeRF (w/o app. reg.), DiffNeRF (ours), ground truth; rows: scan30, scan40). The depth maps produced by the proposed DiffNeRF are more regular than those produced by RegNeRF. DiffNeRF also better separates the object from the background.

Figure 5. Visual impact of the two parameters of the regularization (reconstructions from three views): (a) strong regularization and clipping, (b) strong regularization and little clipping, (c) medium regularization and clipping, (d) little regularization and clipping.

Figure 6. Failure cases for scenes 41 and 82 of the DTU dataset reconstructed from three views. "Floaters" (groups of points with a non-zero density and disjoint from the scene) hide portions of the scene when synthesizing novel views.
is possible to compute other differential quantities related to surface regularity, such as the curvature. This allows us to regularize the surface directly instead of regularizing projections of the scene, as was done in Section 3. We propose in this section to look at the Gaussian curvature γ_gauss and the mean curvature γ_mean, since they both have an analytical expression that can easily be implemented using existing deep learning frameworks. These curvatures are respectively defined as

    γ_mean = −div( ∇F / ‖∇F‖ )    (12)

and

    γ_gauss = ( ∇F H*(F) ∇Fᵗ ) / ‖∇F‖⁴,    (13)

where H* is the adjoint of the Hessian of F. Derivation details for these two curvatures can be found in [13].
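Both quantities can be obtained directly with automatic differentiation. A minimal sketch, assuming PyTorch ≥ 2.0 (torch.func) and using a toy signed distance function in place of the learned field F, is:

```python
import torch
from torch.func import grad, jacrev, hessian

def sphere_sdf(x, radius=1.0):
    # Toy implicit function F (signed distance to a sphere). It stands in for
    # the learned network; any differentiable scalar field F: R^3 -> R works.
    return x.norm() - radius

def curvatures(F, x):
    g = grad(F)(x)                              # gradient of F at x, shape (3,)
    H = hessian(F)(x)                           # Hessian of F at x, shape (3, 3)
    # Mean curvature, Eq. (12): minus the divergence of the unit gradient field,
    # i.e. minus the trace of the Jacobian of y -> grad F(y) / ||grad F(y)||.
    unit_grad = lambda y: grad(F)(y) / grad(F)(y).norm()
    gamma_mean = -torch.trace(jacrev(unit_grad)(x))
    # Gaussian curvature, Eq. (13), with H* the adjugate of the Hessian,
    # obtained here from the Cayley-Hamilton identity for 3x3 matrices.
    H_adj = H @ H - torch.trace(H) * H \
            + 0.5 * (torch.trace(H) ** 2 - torch.trace(H @ H)) * torch.eye(3)
    gamma_gauss = g @ H_adj @ g / g.norm() ** 4
    return gamma_mean, gamma_gauss

x = torch.tensor([0.6, 0.0, 0.8])               # a point on the unit sphere
print(curvatures(sphere_sdf, x))                # approx (-2.0, 1.0) for this F
```

For the unit sphere the sketch returns a mean curvature of magnitude 2 and a Gaussian curvature of 1, as expected; the sign of γ_mean depends on the orientation convention chosen for ∇F.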
Using (12) and (13), we can define a regularization loss similar to the one presented in Section 3:

    L_curv(κ_curv) = E_{x∈S}[ min(|γ(x)|, κ_curv) ],    (14)

where γ can be either γ_mean or γ_gauss, depending on the preferred behavior, and κ_curv is a clipping value. The final loss to train a regularized VolSDF model using (14) becomes

    L = L_RGB + λ_SDF L_SDF + λ_curv L_curv(κ_curv)    (15)

with

    L_SDF = E_{x∈R³}[ (‖∇F(x)‖ − 1)² ].    (16)

As in [45], the L_SDF term enforces the Eikonal constraint on the implicit function F, thus learning a signed distance function. Note that (15) makes it possible to regularize the surface directly during training instead of doing it in separate stages as in [44].
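A sketch of the combined objective (14)-(16) is given below; the weights λ_SDF and λ_curv, the clip value κ_curv, and the helper names (rgb_loss, curvature_fn, the point batches) are illustrative placeholders, not the values used in our experiments. F is assumed to map a batch of 3D points to SDF values.

```python
import torch

def regularized_volsdf_loss(F, rgb_loss, pts_eik, pts_surf, curvature_fn,
                            lambda_sdf=0.1, lambda_curv=0.01, kappa_curv=5.0):
    # Eikonal term, Eq. (16): the gradient norm of F should be 1 everywhere.
    pts_eik = pts_eik.requires_grad_(True)
    f = F(pts_eik)
    g = torch.autograd.grad(f.sum(), pts_eik, create_graph=True)[0]
    loss_sdf = ((g.norm(dim=-1) - 1.0) ** 2).mean()
    # Curvature term, Eq. (14): curvature clipped at kappa_curv, averaged over
    # points sampled on or near the estimated surface S.
    gamma = curvature_fn(F, pts_surf)            # gamma_mean or gamma_gauss per point
    loss_curv = gamma.abs().clamp(max=kappa_curv).mean()
    # Full objective, Eq. (15).
    return rgb_loss + lambda_sdf * loss_sdf + lambda_curv * loss_curv
```

Here curvature_fn would evaluate γ_mean or γ_gauss (for instance the curvatures sketch above mapped over the batch) at points sampled near the surface S, and rgb_loss is the usual photometric rendering loss.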
The regularization is characterized by the same two parameters, the regularization weight and the clipping value, as the regularization presented in Section 3. To understand the impact of these parameters, we refer to the definition of the mean and Gaussian curvatures in terms of the minimum curvature γ_min and maximum curvature γ_max of the surface at a given point:

    γ_mean = (γ_min + γ_max) / 2    and    γ_gauss = γ_min γ_max.    (17)

Although this is not a practical definition of the curvature, since it does not allow direct computation, it shows that minimizing the mean curvature leads to surface smoothing [10]. On the other hand, minimizing the Gaussian curvature forces the minimum curvature to be zero, resulting in flat surfaces with sharp straight edges. The visual impact on the reconstructed surfaces is shown in the supplementary material. An example of a regularized reconstruction using Gaussian curvature is shown in Fig. 7.

Figure 7. Visual example of a regularized reconstruction of scene 40 of the DTU dataset. From left to right: regularized reconstruction using Gaussian curvature (13), original VolSDF results and ground truth.

6. Conclusions

With DiffNeRF, a variant of NeRF that relies on differential geometry to regularize the depth or the normals of the learned scene, we demonstrated that it is possible to achieve state-of-the-art novel view synthesis and depth estimation in few-shot neural rendering with a simple yet flexible regularization framework. This is made possible by modern deep learning frameworks, which already provide the necessary tools to implement differential geometry operators, thus facilitating their use in practice. However, the use of differential geometry is still subject to certain limitations. Higher-order operators can be costly both in memory and in computation time, so a careful choice of the regularization term is essential. Operators should be chosen differently depending on the problem at hand. For example, a Gaussian curvature regularization may be appropriate for flat surfaces with strong edges, such as buildings, but could fill holes in irregular surfaces. The vast literature on differential geometry opens up many exciting opportunities to define new regularization tools with the appropriate mathematical formalism, which we hope will push the limits of neural rendering even further. Additional studies to understand the impact of the activation function (such as softplus, squareplus [1], sine [38], Gaussian [9], etc.) on the results are also necessary.

Acknowledgements

Work partly financed by Office of Naval Research grant N00014-17-1-2552, MENRT, and Kayrros. It was also performed using HPC resources from GENCI-IDRIS (grants AD011012453R2 and AD011011801R3) and from the "Mésocentre" computing center of CentraleSupélec and ENS Paris-Saclay, supported by CNRS and Région Île-de-France (http://mesocentre.centralesupelec.fr). Centre Borelli is also with Université Paris Cité, SSA and INSERM.

References

[1] Jonathan T. Barron. Squareplus: A Softplus-Like Algebraic Rectifier, 2021.
[2] Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. Mip-NeRF: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 5855–5864, 2021.
[3] Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, and Hendrik Lensch. NeRD: Neural reflectance decomposition from image collections. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 12684–12694, 2021.
[4] Shengqu Cai, Anton Obukhov, Dengxin Dai, and Luc Van Gool. Pix2NeRF: Unsupervised conditional π-GAN for single image to neural radiance fields translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[5] Eric R. Chan, Marco Monteiro, Petr Kellnhofer, Jiajun Wu, and Gordon Wetzstein. pi-GAN: Periodic implicit generative adversarial networks for 3D-aware image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5799–5809, 2021.
[6] Anpei Chen, Zexiang Xu, Fuqiang Zhao, Xiaoshuai Zhang, Fanbo Xiang, Jingyi Yu, and Hao Su. MVSNeRF: Fast generalizable radiance field reconstruction from multi-view stereo. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 14124–14133, 2021.
[7] Di Chen, Yu Liu, Lianghua Huang, Bin Wang, and Pan Pan. GeoAug: Data augmentation for few-shot NeRF with geometry constraints. In Computer Vision – ECCV 2022, Lecture Notes in Computer Science, pages 322–337. Springer Nature Switzerland, 2022.
[8] Julian Chibane, Aayush Bansal, Verica Lazova, and Gerard Pons-Moll. Stereo radiance fields (SRF): Learning view synthesis for sparse views of novel scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7911–7920, 2021.
[9] Shin-Fang Chng, Sameera Ramasinghe, Jamie Sherrah, and Simon Lucey. Gaussian activated neural radiance fields for high fidelity reconstruction and pose estimation. In European Conference on Computer Vision, pages 264–280. Springer, 2022.
[10] Ulrich Clarenz, Udo Diewald, and Martin Rumpf. Anisotropic geometric diffusion in surface processing. In Proceedings of the IEEE Visualization Conference, 2000.
[11] Kangle Deng, Andrew Liu, Jun-Yan Zhu, and Deva Ramanan. Depth-supervised NeRF: Fewer views and faster training for free. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[12] Yasutaka Furukawa and Jean Ponce. Accurate, dense, and robust multi-view stereopsis (PMVS). In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, page 3, 2007.
[13] Ron Goldman. Curvature formulas for implicit curves and surfaces. Computer Aided Geometric Design, 22(7):632–658, 2005.
[14] Amos Gropp, Lior Yariv, Niv Haim, Matan Atzmon, and Yaron Lipman. Implicit geometric regularization for learning shapes. In International Conference on Machine Learning, pages 3789–3799. PMLR, 2020.
[15] Heiko Hirschmuller. Stereo processing by semiglobal matching and mutual information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):328–341, 2007.
[16] Ajay Jain, Matthew Tancik, and Pieter Abbeel. Putting NeRF on a diet: Semantically consistent few-shot view synthesis. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 5885–5894, 2021.
[17] Rasmus Jensen, Anders Dahl, George Vogiatzis, Engin Tola, and Henrik Aanæs. Large scale multi-view stereopsis evaluation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 406–413, 2014.
[18] Mijeong Kim, Seonguk Seo, and Bohyung Han. InfoNeRF: Ray entropy minimization for few-shot neural volume rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[19] Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. Neural scene flow fields for space-time view synthesis of dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6498–6508, 2021.
[20] Roger Marí, Gabriele Facciolo, and Thibaud Ehret. Sat-NeRF: Learning multi-view satellite photogrammetry with transient objects and shadow modeling using RPC cameras. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022.
[21] Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, and Daniel Duckworth. NeRF in the wild: Neural radiance fields for unconstrained photo collections. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7210–7219, 2021.
[22] Nelson Max. Optical models for direct volume rendering. IEEE Transactions on Visualization and Computer Graphics, 1(2):99–108, 1995.
[23] Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Transactions on Graphics (TOG), 38(4):1–14, 2019.
[24] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. NeRF: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, pages 405–421, 2020.
[25] Pierre Moulon, Pascal Monasse, Romuald Perrot, and Renaud Marlet. OpenMVG: Open multiple view geometry. In International Workshop on Reproducible Research in Pattern Recognition, pages 60–74. Springer, 2016.
[26] Michael Niemeyer, Jonathan T. Barron, Ben Mildenhall, Mehdi S. M. Sajjadi, Andreas Geiger, and Noha Radwan. RegNeRF: Regularizing neural radiance fields for view synthesis from sparse inputs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[27] Takashi Otonari, Satoshi Ikehata, and Kiyoharu Aizawa. Non-uniform sampling strategies for NeRF on 360° images. In 33rd British Machine Vision Conference 2022 (BMVC 2022), London, UK, November 21–24, 2022. BMVA Press, 2022.
[28] Keunhong Park, Utkarsh Sinha, Jonathan T. Barron, Sofien Bouaziz, Dan B. Goldman, Steven M. Seitz, and Ricardo Martin-Brualla. Nerfies: Deformable neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 5865–5874, 2021.
[29] Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T. Barron, Sofien Bouaziz, Dan B. Goldman, Ricardo Martin-Brualla, and Steven M. Seitz. HyperNeRF: A higher-dimensional representation for topologically varying neural radiance fields. ACM Transactions on Graphics, 40(6), 2021.
[30] Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-NeRF: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10318–10327, 2021.
[31] Daniel Rebain, Mark Matthews, Kwang Moo Yi, Dmitry Lagun, and Andrea Tagliasacchi. LOLNeRF: Learn from one look. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
[32] Barbara Roessle, Jonathan T. Barron, Ben Mildenhall, Pratul P. Srinivasan, and Matthias Nießner. Dense depth priors for neural radiance fields from sparse input views. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[33] Leonid I. Rudin, Stanley Osher, and Emad Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1-4):259–268, 1992.
[34] Ewelina Rupnik, Mehdi Daakir, and Marc Pierrot Deseilligny. MicMac – a free, open-source solution for photogrammetry. Open Geospatial Data, Software and Standards, 2(1):1–9, 2017.
[35] Daniel Scharstein and Richard Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1):7–42, 2002.
[36] Johannes L. Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4104–4113, 2016.
[37] Katja Schwarz, Yiyi Liao, Michael Niemeyer, and Andreas Geiger. GRAF: Generative radiance fields for 3D-aware image synthesis. Advances in Neural Information Processing Systems, 33:20154–20166, 2020.
[38] Vincent Sitzmann, Julien N. P. Martel, Alexander W. Bergman, David B. Lindell, and Gordon Wetzstein. Implicit neural representations with periodic activation functions. Advances in Neural Information Processing Systems, 33:7462–7473, 2020.
[39] Noah Snavely, Steven M. Seitz, and Richard Szeliski. Photo tourism: Exploring photo collections in 3D. In ACM SIGGRAPH 2006 Papers, pages 835–846. ACM Press, 2006.
[40] Pratul P. Srinivasan, Boyang Deng, Xiuming Zhang, Matthew Tancik, Ben Mildenhall, and Jonathan T. Barron. NeRV: Neural reflectance and visibility fields for relighting and view synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7495–7504, 2021.
[41] Chen Sun, Abhinav Shrivastava, Saurabh Singh, and Abhinav Gupta. Revisiting unreasonable effectiveness of data in deep learning era. In Proceedings of the IEEE International Conference on Computer Vision, pages 843–852, 2017.
[42] Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, et al. State of the art on neural rendering. Computer Graphics Forum, 39(2):701–727, 2020.
[43] Edgar Tretschk, Ayush Tewari, Vladislav Golyanik, Michael Zollhöfer, Christoph Lassner, and Christian Theobalt. Non-rigid neural radiance fields: Reconstruction and novel view synthesis of a dynamic scene from monocular video. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 12959–12970, 2021.
[44] Guandao Yang, Serge Belongie, Bharath Hariharan, and Vladlen Koltun. Geometry processing with neural fields. Advances in Neural Information Processing Systems, 34:22483–22497, 2021.
[45] Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. Volume rendering of neural implicit surfaces. Advances in Neural Information Processing Systems, 34, 2021.
[46] Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Basri Ronen, and Yaron Lipman. Multiview neural surface reconstruction by disentangling geometry and appearance. Advances in Neural Information Processing Systems, 33, 2020.
[47] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. pixelNeRF: Neural radiance fields from one or few images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4578–4587, 2021.
[48] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR, 2018.
[49] Xiuming Zhang, Pratul P. Srinivasan, Boyang Deng, Paul Debevec, William T. Freeman, and Jonathan T. Barron. NeRFactor: Neural factorization of shape and reflectance under an unknown illumination. ACM Transactions on Graphics (TOG), 40(6):1–18, 2021.