Text Mining Method for Power Equipment Defects Based on Two-branch Feature Fusion (cited by: 3)


Authors: ZHANG Zhongwen (张中文), TUSONGJIANG Kari (吐松江·卡日) [1], ZHANG Ziwei (张紫薇), CUI Chuanshi (崔传世), SHAO Luo (邵罗)

Affiliations: [1] School of Electrical Engineering, Xinjiang University, Urumqi 830049, China; [2] Sichuan Energy Internet Research Institute, Tsinghua University, Chengdu 610299, China

Source: High Voltage Apparatus (《高压电器》), 2024, No. 6, pp. 188-196 (9 pages)

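Funding: General Program of the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C35); National Natural Science Foundation of China (52067021, 52207165).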

Abstract: To address insufficient feature extraction and limited classification accuracy in the knowledge mining and analysis of power equipment defect texts, this paper proposes a two-branch feature fusion model based on BERT (bidirectional encoder representations from transformers) for classifying power equipment defect texts. First, the defect text data are preprocessed: abnormal defect texts are deleted and the characteristics of power equipment defect texts are summarized. Next, BERT serves as the text encoder; the resulting vectors are fed separately into a BiLSTM-Attention (attention-based bidirectional long short-term memory) module and a multi-branch CNN (multi-scale convolutional neural network, MCNN) module, which extract the semantic features and the local key-information features of the defect text, respectively. Finally, the extracted semantic features and multi-dimensional key feature vectors are fused, and a Softmax layer classifies the defect text. Compared with the baseline model BERT-BiLSTM-Attention, accuracy, recall, and F1 score improve by 2.76%, 3.58%, and 4.39%, respectively, indicating that the proposed model performs better on the defect text classification task.
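The pipeline described in the abstract (encoder output, two parallel branches, fusion, Softmax) can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class name `TwoBranchClassifier`, the layer sizes, the kernel widths, the number of classes, the simple learned attention pooling, and fusion by concatenation are all assumptions made for illustration, and a random tensor stands in for the BERT token embeddings so that the example runs without downloading a pre-trained model.

```python
import torch
import torch.nn as nn


class TwoBranchClassifier(nn.Module):
    """Hypothetical two-branch head applied to BERT-style token embeddings."""

    def __init__(self, embed_dim=768, lstm_hidden=128, conv_channels=64,
                 kernel_sizes=(2, 3, 4), num_classes=4):
        super().__init__()
        # Branch 1 (assumed form): BiLSTM followed by learned attention pooling.
        self.bilstm = nn.LSTM(embed_dim, lstm_hidden, batch_first=True,
                              bidirectional=True)
        self.attn_score = nn.Linear(2 * lstm_hidden, 1)
        # Branch 2 (assumed form): parallel 1-D convolutions of different widths.
        self.convs = nn.ModuleList(
            [nn.Conv1d(embed_dim, conv_channels, k) for k in kernel_sizes]
        )
        fused_dim = 2 * lstm_hidden + conv_channels * len(kernel_sizes)
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim)
        lstm_out, _ = self.bilstm(token_embeddings)                 # (B, T, 2H)
        weights = torch.softmax(self.attn_score(lstm_out), dim=1)   # (B, T, 1)
        semantic = (weights * lstm_out).sum(dim=1)                  # (B, 2H)

        conv_in = token_embeddings.transpose(1, 2)                  # (B, E, T)
        # Max-pool each convolution output over the time axis.
        local = [torch.relu(conv(conv_in)).max(dim=2).values
                 for conv in self.convs]
        local = torch.cat(local, dim=1)                             # (B, C * K)

        # Fusion by concatenation (an assumption), then Softmax over classes.
        fused = torch.cat([semantic, local], dim=1)
        return torch.softmax(self.classifier(fused), dim=1)


torch.manual_seed(0)
model = TwoBranchClassifier()
fake_bert_output = torch.randn(2, 16, 768)  # stand-in for BERT embeddings
probs = model(fake_bert_output)
print(probs.shape)
```

In a real setting the random tensor would be replaced by the last hidden state of a BERT encoder, and the model would typically return logits during training so that a cross-entropy loss can be applied.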

Keywords: pre-trained model; multi-dimensional feature extraction; semantic information features; defect text classification

CLC number: TM50 [Electrical engineering: electrical apparatus]; TP391.1 [Automation and computer technology: computer application technology]

 
