Quality Assessment and Improvement Method for Power Grid Equipment Defect Text (电网设备缺陷文本的质量评价与提升方法)

Cited by: 35


Authors: SHAO Guanyu (邵冠宇); WANG Huifang (王慧芳)[1]; HE Benteng (何奔腾)[1]

Affiliation: [1] College of Electrical Engineering, Zhejiang University, Hangzhou 310027, Zhejiang Province, China

Source: Power System Technology (《电网技术》), 2019, No. 4, pp. 1472-1479 (8 pages)

Abstract: Text quality directly affects the effectiveness of text mining. Building on a summary of the quality problems found in the defect texts of power grid enterprises, this paper proposes a quality assessment method and a corresponding improvement method for defect text. First, analysis of a large number of actual defect texts is used to summarize the format of power grid equipment defect text and its typical problems: incomplete, unspecific, or overly redundant descriptions. Then, three quality assessment indexes are defined for these problems, and an assessment method based on hierarchical-adaptive grey relational analysis is presented. Next, for historical defect texts of poor quality, or whose defect level does not match the defect description, latent Dirichlet allocation (LDA) is combined with the defect classification standards of State Grid Corporation of China to correct the text and improve its quality. For newly entered text, the quality assessment method flags quality problems and a word vector mapping method provides correction suggestions, safeguarding the quality of new defect records. Finally, a case study compares quality before and after correction. The results show that the quality scores of the corrected historical defect texts improve considerably, and that problems in newly entered text are identified fairly accurately, with corresponding correction suggestions provided.
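
For readers unfamiliar with the grey relational analysis named in the abstract, the following is a minimal Python sketch of the standard (Deng) grey relational grade only. It is not the paper's hierarchical-adaptive variant, whose details are not given in this record. The function name, the three index names, the sample scores, the equal weights, and the distinguishing coefficient rho = 0.5 are all illustrative assumptions.

```python
def grey_relational_grades(samples, reference, weights=None, rho=0.5):
    """Standard grey relational grade of each sample against a reference.

    samples   : list of rows, one per text, index values already scaled to [0, 1]
    reference : ideal value of each index (assumed here, not from the paper)
    weights   : per-index weights summing to 1; equal weights if omitted
    rho       : distinguishing coefficient, conventionally 0.5
    """
    n_idx = len(reference)
    if weights is None:
        weights = [1.0 / n_idx] * n_idx

    # Absolute deviation of every sample from the reference, per index.
    deltas = [[abs(r - v) for r, v in zip(reference, row)] for row in samples]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)

    # Every sample equals the reference: all coefficients are 1 by convention.
    if d_max == 0:
        return [sum(weights)] * len(samples)

    grades = []
    for row in deltas:
        # Grey relational coefficient for each index of this sample.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(w * c for w, c in zip(weights, coeffs)))
    return grades


# Hypothetical scores on three illustrative indexes:
# (completeness, specificity, conciseness), each in [0, 1].
texts = [
    [1.0, 1.0, 1.0],  # matches the ideal text on every index
    [0.5, 0.8, 0.2],  # incomplete and highly redundant
    [0.9, 0.4, 0.6],  # complete but vague
]
ideal = [1.0, 1.0, 1.0]

for row, grade in zip(texts, grey_relational_grades(texts, ideal)):
    print(row, round(grade, 3))
```

A higher grade means the text lies closer to the ideal reference across all indexes. In a weighted scheme the equal weights would be replaced by index weights, which is presumably where a hierarchical weighting step would enter.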

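Keywords: power grid equipment defect text; text quality assessment; hierarchical-adaptive grey relational analysis; text quality improvement; latent Dirichlet allocation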

Classification code: TM721 [Electrical Engineering: Power Systems and Automation]

 
