STransMNet: a stereo matching method with Swin Transformer fusion  Cited by: 1


Authors: Wang Gaoping [1,2]; Li Xun [1,2]; Jia Xuefang; Li Zhewen; Wang Wenjie

Affiliations: [1] School of Electronics and Information, Xi'an Polytechnic University, Xi'an, Shaanxi 710600, China; [2] Xi'an Polytechnic University Branch of Shaanxi Artificial Intelligence Joint Laboratory, Xi'an, Shaanxi 710600, China

Source: Opto-Electronic Engineering, 2023, No. 4, pp. 74-86 (13 pages)

Funding: National Natural Science Foundation of China (61971339); Natural Science Basic Research Program of Shaanxi Province (2022JM407).

Abstract: Feature extraction in CNN-based stereo matching methods struggles to learn global and long-range context information. To address this, the paper proposes STransMNet (stereo matching net with Swin Transformer fusion), an improved stereo matching network based on the Swin Transformer. The paper first analyzes why aggregating local and global context information is necessary during stereo matching and how the matching features differ. The feature extraction module is then improved by replacing the CNN-based extractor with the Transformer-based Swin Transformer, which strengthens the model's ability to capture long-range context. A multi-scale feature fusion module is added to the Swin Transformer so that the output features carry both shallow and deep semantic information. The loss function is extended with a feature differentiation loss that sharpens the model's attention to detail. Finally, comparative experiments against the STTR-light model on several public datasets show a marked reduction in both the end-point error (EPE) and the 3 px error matching error rate.
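The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the common definitions: EPE as the mean absolute disparity error over valid pixels, and the 3 px error as the fraction of valid pixels whose absolute error exceeds 3 pixels. The function name `disparity_metrics`, the validity-mask convention (ground truth greater than 0 means valid), and the toy arrays are assumptions made for illustration; the paper's exact evaluation protocol, for example how occluded pixels are handled, is not given in this record.

```python
import numpy as np


def disparity_metrics(pred, gt, valid=None, threshold=3.0):
    """Return (EPE, bad-pixel rate) for a predicted disparity map.

    pred, gt  : arrays of the same shape holding disparities in pixels.
    valid     : optional boolean mask of pixels to evaluate (assumed convention).
    threshold : error threshold in pixels; 3.0 gives the "3 px error".
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    if valid is None:
        valid = np.ones(gt.shape, dtype=bool)
    valid = np.asarray(valid, dtype=bool)
    if not valid.any():
        raise ValueError("no valid pixels to evaluate")

    # Absolute disparity error, restricted to the valid pixels.
    err = np.abs(pred - gt)[valid]
    epe = float(err.mean())                # end-point error
    bad = float((err > threshold).mean())  # fraction of pixels off by more than threshold
    return epe, bad


# Toy 2x2 example; a ground-truth value of 0 marks a pixel without a label (assumption).
pred = np.array([[10.0, 12.5], [30.0, 8.0]])
gt = np.array([[10.5, 12.5], [25.0, 0.0]])
epe, bad3 = disparity_metrics(pred, gt, valid=gt > 0)
print(f"EPE = {epe:.4f} px, 3 px error = {bad3:.2%}")
```

Some benchmarks use a variant of the bad-pixel rate that also requires a relative error above a percentage of the true disparity; this sketch uses only the absolute threshold.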

Keywords: stereo matching; Swin Transformer; deep learning; STransMNet

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
