Multi-label Classification of Defective Insulator Images Based on Multimodality


Authors: ZHOU Jing[1], WANG Manyi, TIAN Zhaoxing (School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China)

Affiliation: [1] School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China

Source: High Voltage Engineering (《高电压技术》), 2025, No. 2, pp. 642-651 (10 pages)

Funding: Science and Technology Project of State Grid Corporation of China (5108-202218280A-2-400-XG).

Abstract: Accurate classification of insulator defects in inspection images is one of the key technologies for the automatic inspection of transmission lines. Traditional deep-learning classification methods make insufficient use of textual information, and the classification labels for insulator images are relatively limited. To address both issues, this paper proposes, for the first time, a multimodal multi-label classification method for defective insulator images. First, a multimodal joint data augmentation method performs cross-modal augmentation of insulator images and their label texts. Then, a Vision Transformer network extracts features from the images and a BERT network extracts features from the label texts; drawing on both modalities gives the network more comprehensive information and improves its classification capability. Finally, contrastive learning associates the image features with the text features, which makes the classification more reliable and the results more interpretable. Experimental results show that the method achieves an overall classification accuracy of 93.87% and clearly outperforms the other models compared on the same dataset, providing a sound basis for applying multimodal techniques in the power grid domain.
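The abstract does not say how the aligned image and text features are turned into a set of labels. One common way to use a contrastively trained image-text pair of encoders for multi-label prediction is to score each label text against the image by cosine similarity and keep every label above a threshold. The sketch below illustrates only that idea and is not the authors' implementation. The toy 3-dimensional vectors stand in for Vision Transformer and BERT outputs, and the label names (`broken`, `flashover`, `normal`), the function names, and the `threshold` value are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_scores(image_emb, label_embs):
    """Return {label: cosine similarity} for one image against every label text."""
    img = l2_normalize(image_emb)
    scores = {}
    for label, emb in label_embs.items():
        txt = l2_normalize(emb)
        scores[label] = sum(a * b for a, b in zip(img, txt))
    return scores


def predict_labels(image_emb, label_embs, threshold=0.5):
    """Keep every label whose similarity reaches the (hypothetical) threshold.

    Several labels can pass at once, which is what makes this multi-label.
    """
    scores = cosine_scores(image_emb, label_embs)
    return sorted(label for label, s in scores.items() if s >= threshold)


# Hypothetical label-text embeddings; a real system would obtain these
# from a text encoder such as BERT rather than hard-coding them.
LABEL_EMBS = {
    "broken": [1.0, 0.0, 0.0],
    "flashover": [0.0, 1.0, 0.0],
    "normal": [0.0, 0.0, 1.0],
}

# Hypothetical image embedding; a real system would obtain this from an
# image encoder such as a Vision Transformer.
image_emb = [0.7, 0.7, 0.1]

print(predict_labels(image_emb, LABEL_EMBS))  # ['broken', 'flashover']
```

The per-label similarity scores returned by `cosine_scores` also suggest one way such a model can be inspected: each predicted label comes with a number showing how closely the image matched that label's text.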

Keywords: insulator images; multi-label classification; multimodality; contrastive learning; data augmentation

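Classification codes: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TM216 [Automation and Computer Technology: Computer Science and Technology]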

 
