Navigating 3D Scene Reconstruction: NeRF and Photogrammetry
Neural Radiance Fields (NeRF) and photogrammetry are two of the pivotal technologies in 3D scene reconstruction. Photogrammetry has roots tracing back to the mid-19th century and has matured significantly, with accessible tools proliferating in recent decades. NeRF is a much newer entrant; it became widely accessible with the release of NVIDIA Instant NeRF in 2022, which offered a ready-to-use solution.
This article highlights the main differences and similarities between the two in dataset creation, image alignment, and processing, and shows how they can be applied in a production pipeline. Based on this comparison, we combine the strengths of both technologies for 3D scene reconstruction.
Comparing the Setup Processes for Photogrammetry and NeRF
Crafting the Dataset
At the data capture stage, NeRF and photogrammetry follow similar principles and face comparable constraints. Image quality and sharpness are paramount, which means selecting the right lens, using manual focus, and avoiding fisheye lenses. Sharpness matters more than resolution, for photos and videos alike. Gimbals are advisable for both photographs and videos because they help keep the footage consistent. Digital image stabilization should be avoided, as it can shift the optical center and focal length of the camera.
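One way to put "sharpness over resolution" into practice is to score frames before they enter the dataset. The sketch below uses the variance of the Laplacian, a common focus measure; it is an illustration, not the method used in our pipeline. The function names and the toy 8x8 image are hypothetical, and the image is a plain grid of grayscale values so that the example needs nothing beyond the standard library.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    Higher values mean stronger edges, which is used here as a proxy
    for sharpness. `gray` is a list of rows of grayscale values.
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def box_blur(gray):
    """3x3 box blur with clamped borders, used only to fake a soft frame."""
    h, w = len(gray), len(gray[0])
    blurred = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += gray[yy][xx]
            row.append(total / 9.0)
        blurred.append(row)
    return blurred


# Toy frame: a hard vertical edge, and the same frame after blurring.
sharp = [[0] * 4 + [255] * 4 for _ in range(8)]
soft = box_blur(sharp)

print(laplacian_variance(sharp) > laplacian_variance(soft))  # prints True
```

In a real capture session the score would be computed per frame and the lowest-scoring frames dropped; any cut-off value is scene-dependent and would have to be chosen by inspection.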
Alignment of Images
The initial alignment of images is the foundation for both NeRF and photogrammetry, and experience plays a vital role in this phase. Understanding the nuances of image capture, such as choosing the proper lens and the correct focal length, significantly improves the success rate. Another practical tip: when capturing small objects by moving around them in loops, set the focus manually. This yields more consistent results than autofocus.
Processing
In terms of 3D reconstruction processing, NeRF is generally faster than photogrammetry, but it is constrained by VRAM (video random access memory), even on advanced GPUs (graphics processing units). This limit restricts the scale and detail of reconstructable models compared with photogrammetry. Photogrammetry relies on system RAM (random access memory), which is more accessible and affordable: consumer machines can support over 100 GB, which can fit thousands of images. In contrast, even advanced GPUs such as the RTX 3090 and RTX 4090 are limited to 24 GB of VRAM, which is sufficient for hundreds of images.
In practice, however, NeRF’s volumetric approach excels at processing thin, intricate objects and transparent or reflective surfaces, areas where photogrammetry faces challenges. This is due to the ability to adjust the density threshold in NeRF, which enables more nuanced geometry processing.
Having analyzed each technique, we have identified their respective strengths and weaknesses. This lets us not only compare the two technologies but also combine their strengths in our 3D scene reconstruction process.
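To make the role of the density threshold concrete, here is a minimal sketch on a toy 2D slice of a density field. The grid values, the threshold values, and the function name are all hypothetical, and the premise that a thin structure shows up with lower density than a solid one is an assumption made for illustration only.

```python
def occupied_cells(density, threshold):
    """Return the (row, col) cells whose density reaches the threshold."""
    return {
        (r, c)
        for r, row in enumerate(density)
        for c, value in enumerate(row)
        if value >= threshold
    }


# Toy slice: a solid block (density 9.0), a one-cell-wide wire
# (density 2.0), and a faint floating artifact (density 0.1).
density = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 9.0, 9.0, 0.0, 2.0, 0.0],
    [0.0, 9.0, 9.0, 0.0, 2.0, 0.0],
    [0.0, 0.1, 0.0, 0.0, 2.0, 0.0],
]

strict = occupied_cells(density, threshold=5.0)   # keeps only the block
relaxed = occupied_cells(density, threshold=1.0)  # also keeps the wire

print(len(strict), len(relaxed))  # prints 4 7
```

Lowering the threshold recovers the wire while the faint artifact is still rejected; lowering it further would start to admit noise, which is the trade-off being tuned.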
NeRF, Photogrammetry, or Both?
Maturity and Workflow Integration
NeRF tools are still at an early stage of development compared with mature photogrammetry software, which leads to integration challenges in existing workflows. However, our experiments involving scene reconstruction with both technologies have shown promising results, particularly when combining NeRF geometry with photogrammetry. In our example, the photogrammetry approach had some issues reconstructing the thin parts, while NeRF handled them flawlessly. By bringing the NeRF-generated model into the photogrammetry scene, matching it to its world position there, and projecting textures onto it rather than relying on vertex colors alone, we achieved the best of both worlds.
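Matching the NeRF model to its world position in the photogrammetry scene amounts to applying a scale, a rotation, and a translation to the mesh vertices. The sketch below shows that step under simplifying assumptions: the rotation is restricted to the vertical (Z) axis, and the scale, angle, and offset are made-up values. In practice these parameters would have to be estimated, for example from camera poses or reference points shared by both reconstructions; the function name is hypothetical.

```python
import math


def similarity_transform(vertices, scale, yaw_degrees, translation):
    """Scale uniformly, rotate about the Z axis, then translate.

    `vertices` is an iterable of (x, y, z) tuples in the NeRF model's
    own coordinates; the result is in the target scene's coordinates.
    """
    angle = math.radians(yaw_degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    tx, ty, tz = translation
    transformed = []
    for x, y, z in vertices:
        sx, sy, sz = scale * x, scale * y, scale * z
        transformed.append((
            cos_a * sx - sin_a * sy + tx,
            sin_a * sx + cos_a * sy + ty,
            sz + tz,
        ))
    return transformed


# Hypothetical alignment: the NeRF mesh is half the scene's size,
# rotated a quarter turn, and offset by one unit on every axis.
nerf_vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
aligned = similarity_transform(
    nerf_vertices, scale=2.0, yaw_degrees=90.0, translation=(1.0, 1.0, 1.0)
)
```

Once the vertices are in the photogrammetry scene's coordinate system, the aligned cameras from that scene can be used to project textures onto the imported geometry.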
Insights and Experiments
In our experiments, NeRF’s ability to handle thin parts proved exceptionally beneficial. It allowed us to reconstruct intricate objects and then bring them back into photogrammetry software for enhanced texturing. The end result was a detailed and visually accurate 3D model that shows the potential of combining these technologies.
However, some challenges remain, namely NeRF’s hardware requirements and the nascent state of its tools. Integrating NeRF into existing workflows requires careful consideration, especially given the VRAM limitations that can restrict the scale and detail of the models.
Conclusion
The integration of NeRF and photogrammetry opens avenues for innovative solutions in 3D scene reconstruction and its further application in various business domains. While each technology has its limitations, their combined use shows the potential to overcome individual challenges and achieve detailed, accurate, and textured models. As NeRF continues to evolve and hardware becomes more advanced, the possibilities for further integration and enhanced applications in 3D modeling will keep expanding.
At tsukat, we have already tried and tested each of these technologies in practice. In particular, our engineers developed a custom 3D face reconstruction framework suitable for various industries, from retail to healthcare. Thanks to the research and thorough work of our R&D department, this technology is fast, accurate, and accessible to anyone with a mobile phone. Photogrammetry was among the approaches used to achieve these high-quality results. You can learn more about this work in our white paper dedicated to 3D face reconstruction technology.
As an XR development company, it is essential for us to stay at the forefront of innovative technologies, such as 3D scene reconstruction. The profound technical expertise of our engineers helps us excel in delivering pragmatic software solutions that enhance decision-making, foster customer engagement, and optimize overall business performance.
Additional Information about Photogrammetry and NeRF
What is photogrammetry?
Photogrammetry is a method that involves taking numerous overlapping 2D photographs of real-world objects or scenes from different angles to generate a point cloud. The overlap between these pictures makes it possible to assemble a three-dimensional scene or object. Transforming 2D images into 3D involves identifying common features in the images, such as points, and then using triangulation to determine the 3D location of those features.
Photogrammetry is a valuable tool for a range of professions, including engineers, designers, and architects. The museum industry can also use this technology to preserve cultural heritage sites and to broaden access to exhibitions by enabling curators to showcase their collections virtually. But like any technology, it has its limitations. Compared with NeRF, photogrammetry is generally slower and handles an object’s thin details poorly.
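The triangulation step can be illustrated with the simplest two-view case. The sketch below uses midpoint triangulation: each camera contributes a ray through the matched feature, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This is a textbook simplification rather than what any particular photogrammetry package does internally; the function names and camera positions are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def triangulate_midpoint(origin1, dir1, origin2, dir2):
    """Estimate a 3D point from two viewing rays.

    Each ray starts at a camera centre (`origin`) and points along
    `dir` toward the matched feature. Returns the midpoint of the
    closest points on the two rays.
    """
    w = tuple(p - q for p, q in zip(origin1, origin2))
    a, b, c = dot(dir1, dir1), dot(dir1, dir2), dot(dir2, dir2)
    d, e = dot(dir1, w), dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the feature cannot be triangulated")
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = tuple(o + s * v for o, v in zip(origin1, dir1))
    p2 = tuple(o + t * v for o, v in zip(origin2, dir2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))


# Two cameras one unit apart, both observing the same feature.
point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
    (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0),
)
print(point)  # prints (0.5, 0.0, 2.0)
```

The parallel-ray check also hints at why overlapping photos from different angles matter: when two views are nearly identical, the rays barely converge and the depth estimate becomes unstable.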
What is NeRF?
NeRF (Neural Radiance Fields) employs neural networks to model and generate realistic 3D scenes from a given set of 2D images. Unlike photogrammetry, NeRF does not need every view to be captured in order to produce a convincing scene: the network fills in missing information and blends between the supplied images. Such “gap filling” accelerates the generation of 3D imagery and is a significant factor in the rapid adoption and popularity of this technique. NeRF has already found its place in various industries and is considered highly beneficial when working with thin and reflective objects. Nonetheless, NeRF is a relatively new technology with several noteworthy drawbacks, chiefly its hardware requirements and the immaturity of its tools.
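For readers who want to see what "radiance field" means in practice, the sketch below composites density and color samples along a single camera ray, following the standard volume rendering formulation from the original NeRF paper. It is a simplified illustration: in a real system the densities and colors come from the trained network, whereas here they are hand-written, hypothetical values, and the function name is our own.

```python
import math


def render_ray(densities, colors, step):
    """Composite samples front to back along one ray.

    `densities` holds one non-negative density per sample, `colors`
    one (r, g, b) tuple per sample, and `step` is the spacing between
    samples. Returns the pixel color and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    for sigma, color in zip(densities, colors):
        alpha = 1.0 - math.exp(-sigma * step)  # opacity of this sample
        weight = transmittance * alpha
        for i in range(3):
            pixel[i] += weight * color[i]
        transmittance *= 1.0 - alpha
    return tuple(pixel), transmittance


# A half-transparent white sample in front of an opaque red one.
pixel, remaining = render_ray(
    densities=[math.log(2.0), 1000.0],
    colors=[(1.0, 1.0, 1.0), (1.0, 0.0, 0.0)],
    step=1.0,
)
```

Because every sample contributes in proportion to its density, semi-transparent and very thin structures are represented naturally, which is consistent with the strengths described above.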