MindSpore-powered Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing
Paper title:
NeTO: Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing
Source:
ICCV2023
Paper URL:
Code URL:
https://github.com/nauyihsnehs/NeTO-MindSpore
The MindSpore community supports the analysis of papers from top-level conferences and promotes original AI achievements. In this blog, I'd like to share a paper by the team led by Prof. Xiao Chunxia, School of Computer Science, Wuhan University.
1. Research Background
Reconstructing a three-dimensional model of a real object is a long-standing challenge. It has been researched in computer vision and graphics for decades and has promoted the development of many applications, such as augmented reality, autonomous driving, and robotics. However, existing general-purpose multi-view reconstruction methods are only applicable to opaque objects with approximately Lambertian surfaces. Few of them can handle transparent objects, because the light paths through a transparent object are extremely complex and involve both reflection and refraction.
More advanced methods for reconstructing a three-dimensional model of a transparent object have also been proposed. These methods are optimized by using correspondences between camera rays and locations on a background monitor, or by rotating the background monitor to enforce consistency between the camera rays and the refracted rays. However, these methods use point clouds or meshes to represent transparent object surfaces, and usually require a large number of views as input. When there are not enough input images, they are prone to failing to reconstruct the real geometric shape because the optimization becomes unstable.
How to deal with a self-occluded part of an object is still a critical problem. The widely used refraction-tracing consistency assumes that when light passes through a transparent object, the camera light is refracted only twice on the object surface (on entry and exit). However, when the camera light passes through the self-occluded part, the assumption does not hold, and the light in the self-occluded part is refracted by the curved surface more than twice. Therefore, enforcing refraction-tracing consistency on the self-occluded part is a mistake and will inevitably introduce errors in the reconstruction optimization process.
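To make the two-refraction assumption concrete, here is a minimal sketch of refraction by Snell's law in vector form, written in plain Python rather than MindSpore. The function name `refract`, the refractive indices 1.0 and 1.5, and the flat glass slab are all illustrative assumptions and are not taken from the paper's code.

```python
import math

def refract(d, n, eta_in, eta_out):
    """Refract unit direction d at a surface with unit normal n.

    n must point toward the side the ray comes from (so d . n < 0).
    eta_in / eta_out are the refractive indices before / after the surface.
    Returns the refracted unit direction, or None on total internal reflection.
    """
    eta = eta_in / eta_out
    cos_i = -sum(di * ni for di, ni in zip(d, n))
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # total internal reflection: no refracted ray exists
    cos_t = math.sqrt(1.0 - sin2_t)
    k = eta * cos_i - cos_t
    return tuple(eta * di + k * ni for di, ni in zip(d, n))

# Hypothetical example: a ray crossing a flat glass slab (air -> glass -> air).
d_in = (0.5, 0.0, -math.sqrt(3.0) / 2.0)   # 30 degrees from the normal
normal = (0.0, 0.0, 1.0)                   # both faces: normal faces the ray
d_inside = refract(d_in, normal, 1.0, 1.5)      # first refraction (entry)
d_out = refract(d_inside, normal, 1.5, 1.0)     # second refraction (exit)
print(d_inside, d_out)
```

For a slab with parallel faces, the exit direction equals the entry direction, which is a quick sanity check on the formula. A ray that crosses a self-occluded region would need more than these two calls, which is exactly the case the consistency assumption does not cover.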
2. Team Introduction
Li Zongcheng, the first author of the paper, graduated in 2023 with a master's degree from the School of Computer Science, Wuhan University. His research focuses on 3D reconstruction of transparent objects and inverse rendering.
Xiao Chunxia, the corresponding author of the paper, is a professor at the School of Computer Science, Wuhan University. He has been honored as a distinguished talent by the Ministry of Education, and is mainly engaged in research in computer graphics, virtual reality, augmented reality, and computer vision. He has published more than 160 papers, including over 80 papers in international authoritative or SCI academic journals such as TOG, TPAMI, IJCV, and TVCG, and over 30 papers at top academic conferences such as CVPR, ICCV, ECCV, and AAAI. Moreover, he has been granted a number of national patents and software copyrights, has received national natural science awards, and has presided over multiple major national science projects.
3. Introduction to the Paper
This paper proposes a new method, NeTO, for reconstructing high-quality 3D geometry of transparent objects. The method uses an implicit Signed Distance Function (SDF) to represent the transparent object surface, and optimizes the implicit SDF by using volume rendering and a self-occlusion aware refractive ray tracing technique. The implicit representation enables the method to reconstruct high-quality geometry even from a limited set of views. This paper also proposes a simple and effective strategy to detect self-occluded parts and avoid imposing incorrect constraints on these parts. This strategy determines whether a camera ray is reversible based on the law of reversibility of the light path. That is, if the direction of the light is reversed, the light follows the same path as before the reversal, regardless of the number of reflections or refractions. In this paper, it is assumed that the light is refracted twice. To verify the method, full-view and sparse-view experiments are performed on the DRT dataset and a custom dataset. Extensive experiments show that the method achieves high-quality reconstruction of transparent objects and outperforms previous methods.
The following figure shows self-occlusion in a transparent object and the self-occlusion detection strategy. The left figure shows two light rays: rp is a ray that is not self-occluded, and re is a self-occluded ray. The rp ray undergoes two refractions (on entry and exit) and is a valid ray. The re ray undergoes more than two refractions, indicating that self-occlusion exists, which degrades the reconstruction precision. Therefore, re needs to be discarded. The right figure shows the self-occlusion detection strategy. There is no self-occlusion on the path of the rp ray, so the SDF values of all points on the rp path inside the object are less than 0. Self-occlusion exists on the path of the re ray, so along the re path between entry and exit, some points have SDF values less than 0 and others have SDF values greater than 0.

Figure-1 Self-occlusions and related detection strategy
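The SDF sign test described for Figure-1 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy SDF (two unit spheres), the helper name `is_self_occluded`, the sample count, and the tolerance `eps` are all hypothetical, and the paper's actual implementation may sample and threshold differently.

```python
import math

def sdf_two_spheres(p):
    """Toy SDF (assumption): union of two unit spheres.
    Negative inside the object, positive outside."""
    centers = [(-1.5, 0.0, 0.0), (1.5, 0.0, 0.0)]
    return min(math.dist(p, c) for c in centers) - 1.0

def is_self_occluded(sdf, entry, exit_point, num_samples=64, eps=1e-6):
    """Sample points strictly between the entry and exit points of a ray.
    If any sample lies outside the object (SDF > eps), the ray left the
    object and re-entered it, so the two-refraction assumption fails."""
    for i in range(1, num_samples):
        t = i / num_samples
        p = tuple(a + t * (b - a) for a, b in zip(entry, exit_point))
        if sdf(p) > eps:
            return True
    return False

# A ray through a single sphere: every interior sample has SDF < 0.
ray_p = ((-1.5, 0.0, 1.0), (-1.5, 0.0, -1.0))
# A ray through both spheres: it crosses the empty gap between them.
ray_e = ((-2.5, 0.0, 0.0), (2.5, 0.0, 0.0))

print(is_self_occluded(sdf_two_spheres, *ray_p))
print(is_self_occluded(sdf_two_spheres, *ray_e))
```

In this sketch the first ray plays the role of rp and would be kept, while the second plays the role of re and would be excluded from the refraction-tracing loss.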
The MindSpore-based implementation of the paper can be divided into two parts: rough reconstruction and fine reconstruction. During the rough reconstruction, the implicit SDF is optimized to reconstruct the contour of the transparent object. In the fine reconstruction, self-occlusion and two refractions are considered to reconstruct the details of the transparent object. In both reconstruction stages, the MindSpore nn and ops modules are used to build the models, and the value_and_grad function is used to calculate the gradients for optimization. The load_checkpoint and save_checkpoint functions are used to load and save network weights. In general, the operator APIs of MindSpore are clear, simple, and functionally complete, offering a good user experience.
4. Experiment Results
In this paper, full-view and sparse-view experiments are performed on the DRT dataset and custom dataset. To evaluate the quality of the reconstructed model, this paper calculates the quantitative indicators including accuracy, completeness, precision, recall, and F-score, between the reconstructed model and the real model. Experiments show that the method in this paper can achieve high-quality reconstruction of transparent objects, and is better than the previous methods.

Table-1 Sparse view experiment result

Table-2 Full-view experiment result
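For readers unfamiliar with the indicators reported in the tables, the sketch below computes them for two small point sets in plain Python. The definitions follow the common convention in multi-view reconstruction benchmarks (nearest-neighbor distances plus a distance threshold); this is an assumption, since the blog does not state the paper's exact definitions, sampling density, or threshold. The names `nearest_distances` and `reconstruction_metrics` are hypothetical.

```python
import math

def nearest_distances(src, dst):
    """For each point in src, the distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def reconstruction_metrics(recon, gt, threshold):
    """Assumed definitions:
    accuracy     = mean distance from reconstructed points to ground truth
    completeness = mean distance from ground-truth points to reconstruction
    precision    = share of reconstructed points within threshold of GT
    recall       = share of ground-truth points within threshold of recon
    f_score      = harmonic mean of precision and recall
    """
    d_r2g = nearest_distances(recon, gt)
    d_g2r = nearest_distances(gt, recon)
    precision = sum(d <= threshold for d in d_r2g) / len(d_r2g)
    recall = sum(d <= threshold for d in d_g2r) / len(d_g2r)
    total = precision + recall
    return {
        "accuracy": sum(d_r2g) / len(d_r2g),
        "completeness": sum(d_g2r) / len(d_g2r),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / total if total > 0 else 0.0,
    }

# Toy data (made up for illustration): one point matches, one is 0.5 off.
gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recon_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]
print(reconstruction_metrics(recon_points, gt_points, threshold=0.1))
```

Lower is better for accuracy and completeness, which are distances, while higher is better for precision, recall, and F-score. The brute-force nearest-neighbor search here is quadratic, so a real evaluation on dense point samples would use a spatial index instead.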
5. Summary and Prospects
In this paper, a transparent object reconstruction method is proposed, which uses an implicit SDF to represent the transparent object surface and uses volume rendering to enforce refraction-tracing consistency. In addition, a self-occlusion detection strategy is proposed, which further improves the reconstructed geometry of objects with self-occlusion.
MindSpore provides easy-to-use APIs and flexible building modules, enabling developers to get started quickly and build various deep learning models. As an open source framework, MindSpore does well in performance optimization, especially in distributed training. In addition, MindSpore supports different hardware platforms, offering a wide range of choices. The MindSpore ecosystem is growing, covering more application scenarios and industries and offering more solutions. MindSpore will continue to innovate, improve performance and stability, and introduce advanced deep learning technologies to meet ever-changing requirements. As a young open-source deep learning framework, MindSpore has bright prospects and needs the joint efforts of the community to achieve greater success. As more developers join, more innovations and achievements can be expected. Your participation is welcome.