Application of Semantic Dissimilarity in Similarity Calculation

(Original Chinese title: 相异性在语义相似度计算中的应用)


Authors: GUAN Hui (关慧), MA Tian-yu (马天宇) [1], WANG Guang-wei (王广伟)

Affiliations: [1] Department of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China; [2] Liaoning Key Laboratory of Industrial Intelligence Technology on Chemical Process, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China

Source: Journal of Shenyang University of Chemical Technology (《沈阳化工大学学报》), 2022, No. 2, pp. 167-179 (13 pages)

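Funding: 2021 Scientific Research Fund Project of the Education Department of Liaoning Province (LJKZ0434); Guidance Project of the Natural Science Foundation of Liaoning Province (201602583).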

Abstract: Previous work on semantic similarity has focused on measuring similarity in the forward direction: the relatedness of two concepts is computed from the path length between them, their information content, or their features. This tends to produce high similarity scores that often deviate considerably from human judgments. As semantic similarity calculation moves closer to simulating human reasoning, taking the dissimilarity between meanings into account becomes important. This paper therefore approaches the problem from the reverse direction and proposes a method that incorporates semantic dissimilarity into similarity calculation. The method uses the hierarchical structure of WordNet to mine antonymy relations between concepts in depth, and then applies four different strategies to combine the dissimilarity represented by these relations, expressed as an antonym factor, with existing methods. The final similarity results are obtained experimentally by reproducing the existing methods and combining them with the antonym factor. The parameters and correlation coefficients of the proposed dissimilarity-based model are further analyzed and discussed. The experimental results show that the proposed model achieves a higher correlation coefficient with human judgments than the other methods, and that it improves the accuracy of existing path-distance-based semantic similarity methods.
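The idea of adjusting a path-based score with an antonym factor can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the four combination strategies, so the multiplicative penalty, the `antonym_factor` parameter, the toy hierarchy `PARENT`, and the toy antonym set `ANTONYMS` are hypothetical and are not taken from the paper or from WordNet.

```python
from typing import Dict, FrozenSet, List, Optional, Set

# Toy is-a hierarchy (child -> parent). Hypothetical data, not WordNet.
PARENT: Dict[str, Optional[str]] = {
    "attribute": None,
    "temperature": "attribute",
    "hot": "temperature",
    "cold": "temperature",
    "warm": "temperature",
}

# Toy antonym pairs. The paper mines such relations from WordNet;
# here they are hard-coded for illustration only.
ANTONYMS: Set[FrozenSet[str]] = {frozenset({"hot", "cold"})}


def ancestors(concept: str) -> List[str]:
    """Return the chain from the concept up to the root, inclusive."""
    chain: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def path_length(a: str, b: str) -> int:
    """Count edges on the path between a and b via their lowest common ancestor."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    steps_in_b = {node: i for i, node in enumerate(chain_b)}
    for i, node in enumerate(chain_a):
        if node in steps_in_b:
            return i + steps_in_b[node]
    raise ValueError("concepts share no common ancestor")


def path_similarity(a: str, b: str) -> float:
    """A common path-based measure: 1 / (1 + shortest path length)."""
    return 1.0 / (1.0 + path_length(a, b))


def adjusted_similarity(a: str, b: str, antonym_factor: float = 0.5) -> float:
    """Scale the path-based score down when the pair is marked as antonyms.

    The multiplicative penalty is an assumed strategy, not the paper's.
    """
    base = path_similarity(a, b)
    if frozenset({a, b}) in ANTONYMS:
        return base * (1.0 - antonym_factor)
    return base
```

In this toy hierarchy, "hot"/"cold" and "hot"/"warm" are both siblings and so receive the same path-based score; only the antonym pair is reduced by `adjusted_similarity`, which is the kind of separation the abstract argues a purely forward measure misses.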

Keywords: semantic similarity; WordNet; dissimilarity; antonymy relation

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
