High-Resolution Slope Scene Image Classification Based on SwinT-MFPN

Authors: Tu Yin; Li Denghua [2,3]; Ding Yong

Affiliations: [1] College of Science, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China; [2] Nanjing Hydraulic Research Institute, Nanjing 210024, Jiangsu, China; [3] Key Laboratory of Reservoir Dam Safety, Ministry of Water Resources, Nanjing 210024, Jiangsu, China

Source: Laser & Optoelectronics Progress, 2024, No. 22, pp. 455-465 (11 pages)

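Funding: National Key Research and Development Program of China (2022YFC3005502); National Natural Science Foundation of China, Joint Fund for Changjiang River Water Science Research (U2240221); National Natural Science Foundation of China (51979174).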

Abstract: To address the rapid growth in computational complexity and the slow convergence associated with high-resolution images, this paper proposes SwinT-MFPN, a slope scene image classification model built on the Swin-Transformer and the feature pyramid network (FPN) that balances performance, inference speed, and convergence speed. First, the Mish activation function is introduced into the FPN to construct an MFPN structure, which extracts features from the original high-resolution image, produces feature maps with reduced height and width, and discards some redundant low-level feature information to strengthen key features. The Swin-Transformer, which has strong deep-level feature extraction capability, is then used as the backbone feature extraction network, and its original cross-entropy loss function is replaced by a weighted cross-entropy loss function to reduce the effect of class imbalance on model predictions. In addition, an evaluation metric based on the root mean square error of accuracy is proposed, and the stability of the proposed model is verified on a self-built dam slope dataset. Experimental results show that the proposed model achieves a mean average precision (mAP) of 95.48% and a 3.01% improvement in time performance, outperforming most mainstream models and demonstrating its applicability and effectiveness.
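The two standard building blocks named in the abstract, the Mish activation and a weighted cross-entropy loss, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract gives neither the class weights nor the reduction used. The helper `inverse_frequency_weights` is a hypothetical weighting scheme, and the weight-normalized batch mean is an assumption that follows a common convention.

```python
import math


def softplus(x: float) -> float:
    # Numerically stable ln(1 + e^x).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    # Mish activation: x * tanh(softplus(x)).
    return x * math.tanh(softplus(x))


def log_softmax(logits):
    # Log-softmax computed with the max-shift trick for stability.
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def inverse_frequency_weights(class_counts):
    # Hypothetical scheme (not from the paper): w_c = N / (K * n_c),
    # so rarer classes receive larger weights.
    total = sum(class_counts)
    k = len(class_counts)
    return [total / (k * n) for n in class_counts]


def weighted_cross_entropy(batch_logits, targets, weights):
    # Per-sample loss is -w_y * log softmax(z)_y; the batch value is
    # normalized by the sum of the target weights (an assumed convention).
    num = 0.0
    den = 0.0
    for logits, y in zip(batch_logits, targets):
        num += -weights[y] * log_softmax(logits)[y]
        den += weights[y]
    return num / den


if __name__ == "__main__":
    # Illustrative counts for an imbalanced two-class problem.
    w = inverse_frequency_weights([90, 10])
    loss = weighted_cross_entropy([[2.0, 0.5], [0.1, 1.2]], [0, 1], w)
    print(w, loss, mish(1.0))
```

With such a weighting, a misclassified sample from the rare class contributes more to the loss than one from the frequent class, which is the general mechanism by which a weighted loss counteracts class imbalance.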

Keywords: high resolution; image classification; slope; feature pyramid network; Swin-Transformer

Classification code: TP751 [Automation and Computer Technology: Detection Technology and Automatic Equipment]

 
