Text correlation calculation based on passage-level event representation

Cited by: 2


Authors: Ming LIU [1,2]; Zihao ZHENG; Bing QIN; Yitong LIU [3]; Yang LI

Affiliations: [1] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; [2] PENG CHENG Laboratory, Shenzhen 518066, China; [3] Tencent Technology (Beijing) Co., Ltd., Beijing 100193, China

Source: Scientia Sinica (Informationis), 2020, No. 7, pp. 1033-1054 (22 pages)

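Funding: Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2018AAA0101901); the National Key R&D Program of China (Grant No. 2018YFB1005103); the Key Program of the National Natural Science Foundation of China (Grant No. 61632011); the General Program of the National Natural Science Foundation of China (Grant Nos. 61772156, 61976073); and the General Program of Heilongjiang Province (Grant No. F2018013).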

Abstract: With the rapid growth of web information, information flow services have attracted wide attention from users. In such services, measuring the correlation between texts, and thereby filtering out redundant information gathered from multiple sources to improve recommendation satisfaction, is a crucial step. Current mainstream text correlation methods represent each text as a vector and use the similarity between vectors as the correlation between texts. However, most texts in an information flow are news articles, and the core of a news article is the event it describes. The core features of a text should therefore be mined from the perspective of events and then used to calculate text correlation. Most existing research on events works at the sentence level, whereas calculating text correlation requires grasping the content of an article at the passage level, so passage-level event analysis has greater impact. To this end, building on sentence-level event extraction, this paper proposes a passage-level event representation method. It uses the sentence-level extraction results to construct a passage event connection graph, selects the important nodes in the graph as representatives of the passage-level events, and then uses the resulting passage-level event representation to measure the correlation between texts. Experiments show that the proposed method performs far better than conventional text correlation calculation methods.
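The pipeline outlined in the abstract (sentence-level events, a connection graph over them, selection of important nodes, correlation from the selected events) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the event format, the edge rule, the node-importance measure, or the final correlation formula. Everything below is an assumption made for illustration: events are hypothetical `(trigger, arguments)` pairs, two events are linked when they share an argument, importance is scored with a PageRank-style iteration, and events are compared by Jaccard overlap.

```python
from itertools import combinations

def build_event_graph(events):
    """Assumed edge rule: link two events that share arguments; weight = number shared."""
    n = len(events)
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = len(events[i][1] & events[j][1])
        weights[i][j] = weights[j][i] = float(shared)
    return weights

def rank_nodes(weights, damping=0.85, iterations=50):
    """Assumed importance measure: PageRank-style power iteration on the weighted graph."""
    n = len(weights)
    if n == 0:
        return []
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * weights[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores

def passage_events(events, top_k=2):
    """Keep the top_k highest-scoring events as the passage-level representation."""
    scores = rank_nodes(build_event_graph(events))
    order = sorted(range(len(events)), key=lambda i: (-scores[i], i))
    return [events[i] for i in order[:top_k]]

def event_similarity(a, b):
    """Assumed event similarity: Jaccard overlap of trigger plus arguments."""
    items_a, items_b = {a[0]} | a[1], {b[0]} | b[1]
    return len(items_a & items_b) / len(items_a | items_b)

def text_correlation(events_a, events_b, top_k=2):
    """Symmetric average of best-match similarities between representative events."""
    reps_a, reps_b = passage_events(events_a, top_k), passage_events(events_b, top_k)
    if not reps_a or not reps_b:
        return 0.0
    a_to_b = sum(max(event_similarity(a, b) for b in reps_b) for a in reps_a) / len(reps_a)
    b_to_a = sum(max(event_similarity(b, a) for a in reps_a) for b in reps_b) / len(reps_b)
    return (a_to_b + b_to_a) / 2

# Made-up sentence-level events for three short news texts.
doc_a = [("acquire", {"CompanyX", "CompanyY"}),
         ("announce", {"CompanyX", "deal"}),
         ("rise", {"shares", "CompanyY"})]
doc_b = [("buy", {"CompanyX", "CompanyY"}),
         ("comment", {"analyst", "deal"})]
doc_c = [("win", {"TeamA", "final"}),
         ("score", {"PlayerZ", "TeamA"})]

print(text_correlation(doc_a, doc_b), text_correlation(doc_a, doc_c))
```

In this toy data the two acquisition reports share participants and so score higher against each other than against the sports report, which shares nothing with them. A real system would replace the hand-written events with the output of a sentence-level event extractor.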

Keywords: passage event connection graph; passage-level event correlation; text ranking; key clause selection; clause connection graph

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
