Authors: 陈鹏 (CHEN Peng), 邰彬 (TAI Bin), 石英 (SHI Ying) [3], 金杨 (JIN Yang), 孔力 (KONG Li), 汪进锋 (WANG Jinfeng)
Affiliations: [1] Electric Power Research Institute of Guangdong Power Grid Co., Ltd., Guangzhou 510080, Guangdong Province, China; [2] Key Laboratory of Power Equipment Reliability Enterprises in Guangdong Province, Guangzhou 510080, Guangdong Province, China; [3] School of Automation, Wuhan University of Technology, Wuhan 430070, Hubei Province, China
Source: Power System Technology (《电网技术》), 2023, No. 10, pp. 4367-4375 (9 pages)
Funding: Science and Technology Project of China Southern Power Grid (036100KK52200021 (GDKJXM20200443)).
Abstract: As smart grid construction has expanded, a large volume of power equipment defect texts has been generated. These texts contain key information such as fault types, fault causes, and defect elimination methods, and they are a research hotspot in the electric power field. However, defect texts are large in volume, multi-source and heterogeneous, and cluttered and redundant in content, and there is currently no efficient method for integrating and using them. To address these problems, this paper studies named entity extraction based on the BERT (bidirectional encoder representation from transformers) model. On the one hand, a bi-directional long short-term memory (Bi-LSTM) layer is added to further extract the semantic information of the text. On the other hand, a conditional random field (CRF) replaces the output layer of BERT, which overcomes the local-optimum problem in predicted labels. Combining these two strategies, the paper proposes an improved BERT algorithm, which joins BERT with the Bi-LSTM network and the CRF to perform named entity extraction on defect texts. Experimental results show that the improved BERT algorithm achieves high F1 values (the weighted harmonic mean of precision and recall) on 7 types of entities. Compared with BERT alone, the overall precision and recall of entity extraction improve by 0.94% and 0.95%, respectively.
Keywords: power equipment defect text; named entity extraction; improved BERT algorithm; semantic information; output layer; local optimum
Classification number: TM721 [Electrical Engineering: Power Systems and Automation]
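
The abstract states that replacing BERT's output layer with a CRF overcomes the local-optimum problem in predicted labels. The sketch below illustrates the general idea only and is not the authors' implementation: it contrasts per-token greedy labelling with Viterbi decoding over a transition matrix, which is the decoding step a linear-chain CRF typically uses. The label set (O, B, I), the emission scores, and the transition penalties are all made-up assumptions; in a trained model the emissions would come from the BERT and Bi-LSTM layers and the transitions would be learned.

```python
# Hypothetical label ids for illustration: 0 = O, 1 = B (entity start), 2 = I (entity inside).

def greedy_decode(emissions):
    """Pick the highest-scoring label for each token independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]


def viterbi_decode(emissions, transitions):
    """Return the label sequence with the highest total emission + transition score."""
    n_labels = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for row in emissions[1:]:
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label i for reaching label j at this token.
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + row[j])
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=score.__getitem__)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a two-token input: rows are tokens, columns are labels O, B, I.
EMISSIONS = [
    [1.0, 0.8, 0.0],
    [0.2, 0.0, 1.0],
]
# TRANSITIONS[i][j] is the score of moving from label i to label j.
# The only non-zero entry penalizes the invalid move O -> I.
TRANSITIONS = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

greedy_path = greedy_decode(EMISSIONS)
viterbi_path = viterbi_decode(EMISSIONS, TRANSITIONS)
```

With these toy scores, greedy labelling chooses each token's locally best label and produces the invalid sequence O, I, while Viterbi decoding scores whole sequences and selects B, I. A full CRF layer would also include start and end scores and would learn the transition matrix during training.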