Authors: ZHANG Zhongwen (张中文); TUSONGJIANG Kari (吐松江·卡日)[1]; ZHANG Ziwei (张紫薇); CUI Chuanshi (崔传世); SHAO Luo (邵罗)
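Affiliations: [1] School of Electrical Engineering, Xinjiang University, Urumqi 830049, China [2] Sichuan Energy Internet Research Institute, Tsinghua University, Chengdu 610299, China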
Source: High Voltage Apparatus (《高压电器》), 2024, No. 6, pp. 188-196 (9 pages)
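Funding: General Program of the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C35); National Natural Science Foundation of China (52067021, 52207165).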
Abstract: Knowledge mining and analysis of power equipment defect texts suffer from insufficient extraction of text feature information and inadequate classification accuracy. To address this, a dual-branch feature fusion model based on BERT (bidirectional encoder representations from transformers) is proposed for classifying power equipment defect texts. First, the defect text data are preprocessed, abnormal defect texts are removed, and the characteristics of power equipment defect texts are summarized. Then, BERT is used as the text encoder, and the resulting vectors are fed separately into a BiLSTM-Attention (attention-based bidirectional long short-term memory) module and a multi-branch CNN (multi-scale convolutional neural network, MCNN) module, which extract semantic information features and local key-information features of the defect text, respectively. Finally, the extracted semantic features and multi-dimensional key feature vectors are fused, and a Softmax layer performs the classification. Compared with the baseline model BERT-BiLSTM-Attention, accuracy, recall, and F1 score improve by 2.76%, 3.58%, and 4.39%, respectively, indicating the superior performance of the proposed model on the defect text classification task.
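The data flow described in the abstract (two feature branches, fusion, Softmax) can be illustrated with a small pure-Python sketch. This is not the authors' implementation. Random vectors stand in for the BERT token encodings, a simple attention pooling stands in for the BiLSTM-Attention branch, and fusion is assumed to be plain concatenation, which the abstract does not specify. All names, dimensions, and kernel sizes (`attention_pool`, `multi_scale_conv`, `DIM`, kernel sizes 2/3/4) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(states, query):
    """Semantic branch stand-in: attention-weighted sum of token vectors.

    In the paper this branch is a BiLSTM followed by attention; the
    recurrent layer is omitted here to keep the sketch short.
    """
    weights = softmax([dot(h, query) for h in states])
    dim = len(states[0])
    return [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]


def multi_scale_conv(tokens, filters):
    """Local branch: 1-D convolutions with several kernel sizes.

    filters maps kernel size k to a list of flat weight vectors of
    length k * dim. Each filter output goes through ReLU and
    max-over-time pooling, giving one feature per filter.
    """
    features = []
    for k, bank in sorted(filters.items()):
        # Flatten each window of k consecutive token vectors.
        windows = [sum(tokens[i:i + k], []) for i in range(len(tokens) - k + 1)]
        for w in bank:
            features.append(max(max(0.0, dot(win, w)) for win in windows))
    return features


def classify(tokens, query, filters, class_weights):
    semantic = attention_pool(tokens, query)
    local = multi_scale_conv(tokens, filters)
    fused = semantic + local  # assumed fusion: concatenation
    probs = softmax([dot(fused, w) for w in class_weights])
    return fused, probs


# Toy inputs: random vectors in place of real BERT encodings and trained weights.
rng = random.Random(0)


def rand_vec(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


DIM, SEQ_LEN, N_CLASSES, FILTERS_PER_SIZE = 4, 6, 3, 2
KERNEL_SIZES = (2, 3, 4)

tokens = [rand_vec(DIM) for _ in range(SEQ_LEN)]
query = rand_vec(DIM)
filters = {k: [rand_vec(k * DIM) for _ in range(FILTERS_PER_SIZE)] for k in KERNEL_SIZES}
fused_dim = DIM + FILTERS_PER_SIZE * len(KERNEL_SIZES)
class_weights = [rand_vec(fused_dim) for _ in range(N_CLASSES)]

fused, probs = classify(tokens, query, filters, class_weights)
predicted_class = probs.index(max(probs))
```

Concatenation keeps both feature sets intact and leaves it to the final linear layer to weigh them. A trained model would learn `query`, the filter weights, and `class_weights` rather than drawing them at random.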
Keywords: pre-trained model; multi-dimensional feature extraction; semantic information features; defect text classification
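Classification codes: TM50 [Electrical Engineering: Electrical Apparatus]; TP391.1 [Automation and Computer Technology: Computer Application Technology]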