Semantic Framework-Based Defect Text Mining Technique and Application in Power Grid (基于语义框架的电网缺陷文本挖掘技术及其应用)  Cited by: 86


Authors: CAO Jing (曹靖)[1], CHEN Lushen (陈陆燊), QIU Jian (邱剑)[1], WANG Huifang (王慧芳)[1], YING Gaoliang (应高亮), ZHANG Bo (张波)

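Affiliations: [1] College of Electrical Engineering, Zhejiang University, Hangzhou 310027, Zhejiang Province; [2] State Grid Zhejiang Jinhua Power Supply Company, Jinhua 321017, Zhejiang Province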

Source: Power System Technology (《电网技术》), 2017, No. 2, pp. 637-643 (7 pages)

Abstract: Power grid enterprises hold large volumes of equipment defect texts that contain important reliability information, but mining them manually is inefficient and its accuracy varies from person to person. Taking transformer defect texts as the study object and analyzing their characteristics, this paper establishes a defect text mining model based on a semantic framework. The model addresses the difficulty of dividing defect texts into sentence elements and of extracting numerical values precisely, and it provides a new technique for mining unstructured data in the power grid domain. First, on the basis of an ontology lexicon, the defect texts are preprocessed through word segmentation and lexical feature extraction. Then the power semantic framework and semantic slots are defined, procedures for slot filling and semantic framework construction are proposed, and the ontology dictionary is refined automatically by merging word strings. Finally, the application of the mining results to reliability statistics is studied. A case study shows that the proposed technique is feasible and effective for the automatic classification and statistics of power grid defects.

Keywords: text mining; semantic framework; reliability statistics; defect text

Classification code: TM72 [Electrical Engineering: Power Systems and Automation]
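
Illustrative sketch (not from the paper): the abstract describes dictionary-based word segmentation followed by slot filling into a semantic frame. The Python below is a minimal sketch of that general idea, not the authors' implementation. The dictionary entries, the slot names (`equipment`, `attribute`, `state`, `part`, `value`), the use of forward maximum matching, and the sample sentence are all assumptions made for illustration.

```python
import re

# Hypothetical mini ontology dictionary: word -> slot name (assumed, not from the paper).
ONTOLOGY = {
    "主变": "equipment",   # main transformer
    "套管": "part",        # bushing
    "油温": "attribute",   # oil temperature
    "偏高": "state",       # too high
    "渗油": "state",       # oil seepage
}
NUMBER = re.compile(r"\d+(?:\.\d+)?")
MAX_LEN = max(len(word) for word in ONTOLOGY)


def segment(text):
    """Forward maximum matching against ONTOLOGY; numbers are kept as one token."""
    tokens, i = [], 0
    while i < len(text):
        number = NUMBER.match(text, i)
        if number:
            tokens.append(number.group())
            i = number.end()
            continue
        # Try the longest dictionary word first; fall back to a single character.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in ONTOLOGY or size == 1:
                tokens.append(piece)
                i += size
                break
    return tokens


def fill_slots(tokens):
    """Fill each slot with the first matching token; numbers go to the 'value' slot."""
    frame = {}
    for token in tokens:
        if NUMBER.fullmatch(token):
            frame.setdefault("value", float(token))
        elif token in ONTOLOGY:
            frame.setdefault(ONTOLOGY[token], token)
    return frame


if __name__ == "__main__":
    sample = "主变油温85℃偏高"  # hypothetical defect record
    print(fill_slots(segment(sample)))
```

Treating numbers as their own token before dictionary lookup is one simple way to keep numerical values intact during segmentation; the paper's actual procedure for numerical extraction and for dictionary refinement by word string merging is not reproduced here.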

 
