Multi-View 3D Reconstruction Algorithm Based on Attention Mechanism and Neural Rendering  Cited by: 1


Authors: Zhu Daixian [1]; Kong Haoran; Qiu Qiang; Liu Shulin [2]; Zhang Yali

Affiliations: [1] School of Communication and Information Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; [2] School of Electrical and Control Engineering, Xi'an University of Science and Technology, Xi'an 710054, China

Source: Electronic Measurement Technology (《电子测量技术》), 2024, No. 5, pp. 158-166 (9 pages)

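Funding: Supported by the National Natural Science Foundation of China (51774235), the Key Research and Development Program of Shaanxi Province (2021GY-338), and the Science and Technology Plan Project of Beilin District, Xi'an (GX2333).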

Abstract: Multi-view stereo networks reconstruct challenging regions, such as weakly textured or non-Lambertian surfaces, poorly. To address this, the paper first proposes a multi-scale feature extraction module built from three parallel dilated convolutions and an attention mechanism. The module enlarges the receptive field while capturing dependencies between features to obtain global context information, which strengthens the network's feature representation in challenging regions and makes feature matching more robust. Second, an attention mechanism is introduced into the 3D CNN used for cost volume regularization, so that the network attends to the important regions of the cost volume when smoothing it. In addition, a neural rendering network is built: it uses a rendering reference loss to accurately resolve the geometric and appearance information expressed by the radiance field, and a depth consistency loss to keep the multi-view stereo network and the neural rendering network geometrically consistent, which effectively mitigates the adverse effect of a noisy cost volume on the multi-view stereo network. On the indoor DTU dataset, the algorithm achieves completeness and overall scores of 0.289 and 0.326 for point cloud reconstruction, improvements of 24.9% and 8.2% over the baseline method CasMVSNet, and it produces high-quality reconstructions even in challenging regions. On the outdoor Tanks and Temples intermediate dataset, the mean F-score of the reconstructed point clouds is 60.31, a 9.9% improvement over UCS-Net, which indicates strong generalization ability.
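The reported percentages can be read back into the baseline scores they imply. The sketch below is only an arithmetic check under stated assumptions: the improvements are taken to be relative to the baseline value, the DTU completeness and overall metrics are treated as lower-is-better distances, and the F-score as higher-is-better. The function name `implied_baseline` is hypothetical and not from the paper.

```python
def implied_baseline(value, improvement_pct, lower_is_better):
    """Back out the baseline score implied by a reported relative improvement.

    Assumes the improvement is measured relative to the baseline value.
    """
    r = improvement_pct / 100.0
    if lower_is_better:
        # improvement = (baseline - value) / baseline
        return value / (1.0 - r)
    # improvement = (value - baseline) / baseline
    return value / (1.0 + r)


# (metric name, reported value, reported improvement in %, lower is better?)
reported = [
    ("DTU completeness", 0.289, 24.9, True),
    ("DTU overall", 0.326, 8.2, True),
    ("Tanks and Temples mean F-score", 60.31, 9.9, False),
]

baselines = {
    name: implied_baseline(value, pct, lower)
    for name, value, pct, lower in reported
}

for name, baseline in baselines.items():
    print(f"{name}: implied baseline ~ {baseline:.3f}")
```

Under these assumptions the implied CasMVSNet scores come out near 0.385 (completeness) and 0.355 (overall), and the implied UCS-Net mean F-score near 54.9. These are derived from the abstract's rounded figures, so they are approximate.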

Keywords: multi-view stereo network; 3D reconstruction; attention mechanism; neural rendering

CLC number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
