Authors: WANG Xuliang, GU Yuanli, ZHANG Hongru [1], LIU Linghui, LIU Hongshun [1], LI Qingquan [1]
Affiliations: [1] Shandong Provincial Key Laboratory of UHV Transmission Technology and Equipment (Shandong University), Jinan 250061, Shandong Province, China; [2] State Grid Shandong Electric Power Company Laiwu Power Supply Company, Jinan 271100, Shandong Province, China
Source: Power System Technology (《电网技术》), 2024, No. 4, pp. 1690-1699, I0082, I0083, I0084 (13 pages in total)
Funding: Science and Technology Project of State Grid Shandong Electric Power Company (520612220004).
Abstract: As power grids undergo digital transformation and upgrading, intelligent operation and maintenance (O&M) technology for power equipment has developed rapidly, and O&M work has accumulated a large volume of equipment defect texts that contain important grid information. Because text labels are sparse and the descriptive language is ambiguous and inconsistent, the O&M information in these texts is difficult to mine effectively. This paper proposes a data augmentation method for power equipment defect texts. First, the pre-trained model ERNIE (enhanced representation through knowledge integration) is fine-tuned on a defect text dataset, and a multi-stage knowledge masking strategy integrates electrical domain knowledge into the dynamic encoding of the defect texts. Second, building on the manifold assumption, a destruction function and a reconstruction function are designed following the denoising autoencoder architecture: the destruction function follows a mask-unit selection strategy based on information value, and the reconstruction function is built on the fine-tuned ERNIE, so that the "destruction-reconstruction" process yields augmented samples that lie within the manifold of the original data. Third, data selection based on influence functions and diversity measures is applied to the augmented dataset, filtering out augmented samples of poor quality or high repetition. Finally, a multi-layer training framework applies the augmented data to various defect text mining tasks. The case study builds a defect-level classification task for power equipment defect texts from real equipment inspection and maintenance records. The results show that the proposed algorithm substantially improves defect text mining performance and can be applied broadly and flexibly to a variety of power equipment defect text mining tasks.
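The "destruction-reconstruction" step and the diversity filter described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: inverse document frequency stands in for the paper's "information value", the rule of masking the lowest-scoring tokens is a guess, `fill_mask` is a hypothetical callable standing in for the fine-tuned ERNIE, and Jaccard similarity stands in for the paper's diversity measure. The influence-function selection is omitted.

```python
import math
from collections import Counter

MASK = "[MASK]"


def idf_scores(corpus):
    """Inverse document frequency per token (assumed proxy for 'information value')."""
    n_docs = len(corpus)
    doc_freq = Counter(tok for doc in corpus for tok in set(doc))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


def corrupt(tokens, scores, ratio=0.3):
    """Destruction function: mask the lowest-scoring tokens (assumed rule)."""
    n_mask = max(1, int(len(tokens) * ratio))
    # Rank positions by score, breaking ties by position, and mask the first n_mask.
    order = sorted(range(len(tokens)), key=lambda i: (scores.get(tokens[i], 0.0), i))
    masked = set(order[:n_mask])
    return [MASK if i in masked else tok for i, tok in enumerate(tokens)]


def reconstruct(masked_tokens, fill_mask):
    """Reconstruction function: fill each mask with a caller-supplied model.

    fill_mask(tokens, position) is hypothetical; in the paper this role is
    played by the fine-tuned ERNIE.
    """
    out = list(masked_tokens)
    for i, tok in enumerate(out):
        if tok == MASK:
            out[i] = fill_mask(out, i)
    return out


def jaccard(a, b):
    """Token-set overlap between two samples, in [0, 1]."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def select_diverse(candidates, max_sim=0.8):
    """Greedy filter: drop candidates too similar to ones already kept."""
    kept = []
    for cand in candidates:
        if all(jaccard(cand, k) <= max_sim for k in kept):
            kept.append(cand)
    return kept
```

In use, each labelled defect text would be tokenized, passed through `corrupt` and `reconstruct` several times to produce candidates, and the candidates filtered with `select_diverse` before training. Masking low-scoring tokens keeps the domain-specific terms fixed while varying the surrounding wording; the selection rule actually used in the paper may differ.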
Classification code: TM721 [Electrical Engineering - Power Systems and Automation]