Authors: 徐翀 (XU Chong); 王其清 (WANG Qiqing)
Affiliation: [1] State Grid Energy Research Institute Co., Ltd., Changping District, Beijing 102209, China
Source: Electric Power Information and Communication Technology (《电力信息与通信技术》), 2023, No. 4, pp. 31-36 (6 pages)
Funding: Science and Technology Project of the Headquarters of State Grid Corporation of China, "Research and Development of Intelligent Expert Selection Technology for Science and Technology Consulting Based on Knowledge Graph" (1400-202057269A-0-0-00)
Abstract: To overcome the challenges that the specialized, interdisciplinary nature of electric power science and technology texts poses for knowledge acquisition, a domain language model for the power technology field is proposed to achieve more accurate text representation. A Transformer-based language model is pre-trained on a large corpus of power technology papers, patents, project documents, and other texts. Two knowledge extraction tasks, power technology term classification and distantly supervised entity-relation extraction, are designed to validate the model. Experimental results show that the F1 score of the proposed domain language model on the term classification task is more than 10% higher than that of the word2vec baseline, and its AUC score on the entity-relation extraction task is about 2% higher than that of the BERT baseline. The proposed model thus provides higher-quality feature representations for downstream knowledge acquisition tasks.
Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]
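The abstract reports results with two standard metrics: F1 for the term classification task and AUC for the relation extraction task. As a point of reference, both metrics can be computed as follows; this is a minimal pure-Python sketch on toy labels and scores, not the paper's data or evaluation code.

```python
# Illustration of the two evaluation metrics named in the abstract:
# F1 (term classification) and AUC (relation extraction).
# All label/score values below are toy examples.

def f1_score(y_true, y_pred):
    """Binary F1 = 2 * P * R / (P + R), from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def auc_score(y_true, y_score):
    """AUC = probability that a random positive example is scored
    above a random negative one (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 1]              # hard predictions for F1
    y_score = [0.9, 0.2, 0.6, 0.8, 0.4]   # soft scores for AUC
    print(round(f1_score(y_true, y_pred), 3))   # 0.667
    print(round(auc_score(y_true, y_score), 3)) # 1.0
```

Reporting F1 suits the classification task (it balances precision and recall on imbalanced term categories), while AUC suits distantly supervised relation extraction, where the model outputs confidence scores over noisy automatically-labeled pairs and a threshold-free ranking metric is more informative.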