Research on Copper Mine X-ray Image Classification Based on Improved Swin Transformer Model

Authors: HUANG Yongjin (黄永进); HE Jianfeng (何剑锋) [1,2]; LI Weidong (李卫东) [1,2]; XIA Fei (夏菲) [1]; WANG Shan (王杉); WANG Xueyuan (汪雪元) [1,2]; ZHONG Guoyun (钟国韵); QU Jinhui (瞿金辉) [1]

Affiliations: [1] Jiangxi Engineering Technology Research Center of Nuclear Geoscience Data Science and System, East China University of Technology, Nanchang 330013, China; [2] Information Engineering College, East China University of Technology, Nanchang 330013, China; [3] Ganzhou Good Friend Technology Co., Ltd., Ganzhou 341000, Jiangxi, China

Source: Nonferrous Metals (Mineral Processing Section) [《有色金属（选矿部分）》], 2024, No. 12, pp. 112-118, 138 (8 pages)

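Funding: National Natural Science Foundation of China (U2067202); Jiangxi Provincial Key Research and Development Program (20203BBG73069); Jiangxi Provincial Training Program for Academic and Technical Leaders of Major Disciplines (20225BCJ22004).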

Abstract: To address the limited receptive field and blocked dimensional information that constrain conventional neural networks in copper ore image classification, an improved Swin-Transformer model based on X-ray transmission imaging is proposed. The model uses Swin-Transformer as its base framework and adds Mixing Blocks to the second and third stages of the backbone network. Bidirectional interaction between local-window self-attention and depthwise convolution significantly enlarges the model's receptive field, which strengthens its feature representation and modeling capability. In addition, an EMA (Efficient Multi-Scale Attention) module reshapes part of the channels into the batch dimension and groups the channel dimension into multiple sub-features, so that spatial semantic features are evenly distributed within each feature group. This improves the model's fusion of channel and multi-scale spatial information and enhances the representation of region-of-interest features. The experiments use X-ray transmission images of copper ore: 5000 images in total, split 8:2 into training and test sets. Swin-Transformer was selected as the backbone network on the basis of a performance comparison with conventional networks, and the Mixing Block and EMA modules were then introduced to optimize it. The results show that the improved model resolves the receptive-field and dimensional-information limitations and reaches an accuracy of 94.40% on the intelligent copper ore recognition task. Ablation experiments confirm that the added modules improve recognition performance, further supporting the effectiveness of the improved method.
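The two pieces of arithmetic mentioned in the abstract, the 8:2 split of 5000 images and the folding of channel groups into the batch dimension, can be sketched in plain Python. This is only an illustrative sketch, not the authors' code: the file names, the random seed, the feature-map size, and the group count are all assumptions, since the record does not state them.

```python
import random


def split_train_test(items, train_ratio=0.8, seed=0):
    """Shuffle a copy of `items` and split it into (train, test) lists."""
    shuffled = list(items)
    # The fixed seed is an assumption, used only to make the split repeatable.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def grouped_shape(batch, channels, height, width, groups):
    """Shape of a (B, C, H, W) map after folding channel groups into the batch."""
    if channels % groups != 0:
        raise ValueError("channels must be divisible by groups")
    return (batch * groups, channels // groups, height, width)


def grouped_index(b, c, channels, groups):
    """Map (sample b, channel c) to (row, channel) in the grouped layout.

    This matches a contiguous row-major reshape from (B, C) to (B*G, C//G).
    """
    per_group = channels // groups
    return (b * groups + c // per_group, c % per_group)


# Hypothetical file names; the real dataset layout is not given in the source.
image_ids = [f"ore_{i:04d}.png" for i in range(5000)]
train_ids, test_ids = split_train_test(image_ids)
print(len(train_ids), len(test_ids))  # 4000 1000

# Hypothetical feature-map size and group count, chosen only for illustration.
print(grouped_shape(batch=2, channels=96, height=28, width=28, groups=8))
# (16, 12, 28, 28)
```

With these assumed values, each of the 8 groups holds 12 channels, and every group is treated as its own entry along the batch axis, which is the sense in which part of the channel dimension is moved into the batch dimension.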

Keywords: deep learning; X-ray imaging; ore recognition; Swin-Transformer

Classification number: TD921 [Mining Engineering: Mineral Processing]
