Authors: KOU Qiqi (寇旗旗); WANG Weichen (王伟臣); HAN Chenggong (韩成功); LÜ Chen (吕晨); CHENG Deqiang (程德强); JI Yucheng (姬玉成)
Affiliations: [1] School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China; [2] School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China; [3] Big Data Center, Ministry of Emergency Management, Beijing 100013, China
Source: Optics and Precision Engineering (《光学精密工程》), 2024, No. 24, pp. 3603-3615 (13 pages)
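Funding: National Natural Science Foundation of China (No. 52204177); Xuzhou Basic Research Program, Young Scientific and Technological Talents Project (No. KC23026).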
Abstract: Current depth estimation networks do not extract sufficient spatial features from images of outdoor scenes, which leads to object-edge distortion, blurring, and regional artifacts in the output depth maps. To address these problems, this paper proposes a multi-frame self-supervised monocular depth estimation model with multi-scale feature enhancement. First, the encoder incorporates an activation module based on large kernel attention, which improves its ability to extract global spatial features from the input image and preserves spatial context information. In addition, a structural enhancement module is introduced that identifies important features along the channel dimension, strengthening the network's perception of image structure. Finally, the decoder uses dynamic upsampling instead of the traditional nearest-neighbor interpolation upsampling to restore detail, which reduces artifacts in the depth map to some extent. Experimental results show that the proposed network outperforms current mainstream algorithms on the KITTI and CityScapes outdoor datasets, reaching a prediction accuracy of 90.3% on KITTI. Visualization results also show that the depth maps it generates have clearer and more accurate edges, effectively improving the prediction accuracy of the depth estimation network.
Keywords: monocular depth estimation; self-supervised; multi-frame; large kernel attention; feature enhancement
Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]