Keyword extraction based on semantic dependency and external knowledge base

Cited by: 2


Authors: NI Bing; LIAO Guang-zhong [1,2]

Affiliations: [1] School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430081, Hubei, China; [2] Institute of Big Data Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, Hubei, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2022, No. 3, pp. 821-826 (6 pages)

Funding: National Natural Science Foundation of China (61673304); Major Program of the National Social Science Foundation of China (11&ZD189).

Abstract: To improve keyword extraction based on the TextRank algorithm, this paper analyzes the characteristics of Chinese semantic structure and word segmentation algorithms and proposes a method that combines semantic dependency with external knowledge bases. A semantic dependency graph replaces the co-occurrence window when constructing the word graph, which strengthens the semantic connections between its nodes. On this basis, two external knowledge base features, normalized Google distance and a domain dictionary, are introduced, and the edges of the word graph are weighted using information from both inside and outside the document. A forward and backward matching algorithm is then applied to the extracted keywords so that they are more semantically complete. Experimental results show that the method significantly improves keyword extraction on the data set and yields more readable keywords, which verifies its effectiveness.
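The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact edge-weighting formula, so `edge_weight`, its `boost` parameter, and the way distance is turned into similarity are hypothetical choices made here for illustration. The normalized Google distance follows the standard definition NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f is a hit count and N the index size. The edges are assumed to come from a semantic dependency parser (any word pair linked by a dependency arc), which is outside the sketch.

```python
import math
from collections import defaultdict


def ngd(fx, fy, fxy, n):
    """Normalized Google distance from hit counts; smaller means more related."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # the terms never co-occur: treat as unrelated
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom <= 0:
        return 0.0  # degenerate counts; avoid division by zero
    return (max(lx, ly) - lxy) / denom


def edge_weight(dist, in_dictionary, boost=1.5):
    """Hypothetical weighting: map distance to a similarity in [0, 1] and
    boost the edge when a word appears in the domain dictionary."""
    sim = 1.0 / (1.0 + max(dist, 0.0))  # an infinite distance gives 0.0
    return sim * (boost if in_dictionary else 1.0)


def weighted_textrank(edges, d=0.85, iters=50):
    """Weighted TextRank over an undirected word graph.

    edges: dict mapping (word_u, word_v) to a non-negative weight, one entry
    per dependency arc. Returns a dict of word scores.
    """
    nbrs = defaultdict(dict)
    for (u, v), w in edges.items():
        nbrs[u][v] = w
        nbrs[v][u] = w
    score = {word: 1.0 for word in nbrs}
    for _ in range(iters):
        new_score = {}
        for word in nbrs:
            total = 0.0
            for other, w in nbrs[word].items():
                out_sum = sum(nbrs[other].values())
                if out_sum > 0:
                    total += w / out_sum * score[other]
            new_score[word] = (1 - d) + d * total
        score = new_score
    return score
```

In use, each dependency arc would be given a weight such as `edge_weight(ngd(fx, fy, fxy, n), word_in_dictionary)`, and the highest-scoring words from `weighted_textrank` would be taken as candidate keywords before the matching step that merges them into complete phrases.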

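Keywords: extraction; semantic dependency graph analysis; external knowledge base; forward and backward matching algorithm; feature weighting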

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
