SLTA-PathSim: A Similarity Measure Algorithm Combining Node Attributes and Text Information  (Cited by: 6)

Authors: LIU Hui-lin (刘辉林); YAN Na (闫娜); LUO Meng-ying (罗梦莹) (School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China)

Affiliation: [1] School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2020, No. 3, pp. 485-490 (6 pages)

Funding: Supported by the General Program of the National Natural Science Foundation of China (61672144).

Abstract: As an effective tool for integrating large-scale information, heterogeneous information networks have long been of practical importance in data mining tasks. The bibliographic information network is a typical heterogeneous information network, and the problem of measuring author similarity on it has received considerable attention in recent years. Although the PathSim algorithm achieves good results on this problem, it considers only the network structure and the semantics carried by meta-paths, ignoring factors such as node attributes and text information. Building on PathSim, this paper proposes the SLTA-PathSim algorithm, which comprises SL-PathSim (Signature Location-PathSim), based on node attributes, and TA-PathSim (Title and Abstract-PathSim), based on text information. The effectiveness of SLTA-PathSim is verified by searching for the Top-k authors most similar to a given author on the AMiner dataset.
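For orientation, the baseline that the abstract builds on can be sketched in a few lines. The block below is a minimal illustration of the standard PathSim measure only, s(x, y) = 2·M[x][y] / (M[x][x] + M[y][y]), where M counts path instances along a symmetric meta-path, followed by a Top-k lookup. It does not implement SL-PathSim or TA-PathSim, whose details the abstract does not give. The meta-path chosen here (Author-Paper-Venue-Paper-Author), the toy data, and all function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy bibliographic network (not from the paper).
author_papers = {"alice": ["p1", "p2"], "bob": ["p3", "p4"], "carol": ["p5"]}
paper_venue = {"p1": "KDD", "p2": "KDD", "p3": "KDD", "p4": "VLDB", "p5": "VLDB"}


def venue_profile(author_papers, paper_venue):
    """For each author, count how many of their papers appeared in each venue."""
    profile = {}
    for author, papers in author_papers.items():
        counts = defaultdict(int)
        for paper in papers:
            counts[paper_venue[paper]] += 1
        profile[author] = counts
    return profile


def path_count(profile, a, b):
    """Number of Author-Paper-Venue-Paper-Author path instances from a to b."""
    return sum(n * profile[b].get(venue, 0) for venue, n in profile[a].items())


def pathsim(profile, x, y):
    """Baseline PathSim: 2*M[x][y] / (M[x][x] + M[y][y])."""
    denom = path_count(profile, x, x) + path_count(profile, y, y)
    return 2 * path_count(profile, x, y) / denom if denom else 0.0


def top_k(profile, query, k):
    """Return the k authors most similar to `query`, ties broken by name."""
    scored = [(pathsim(profile, query, a), a) for a in profile if a != query]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(author, score) for score, author in scored[:k]]


profile = venue_profile(author_papers, paper_venue)
print(top_k(profile, "alice", 2))
```

In this sketch the score depends only on path counts, which is the limitation the abstract points to: node attributes and paper text play no role in the ranking.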

Keywords: bibliographic information network; similarity measure; meta-path; network embedding

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]

 
