Authors: QIU Qingyu (邱庆羽), LI Jing (李婧), QUAN Bing (全兵), TONG Chao (童超), ZHANG Lijun (张利君), ZHANG Haixian (张海仙) [1]
Affiliations: [1] School of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China; [2] China Mobile (Suzhou) Software Technology Company Limited, Suzhou, Jiangsu 215000, China; [3] Chengdu Ruibeiyingte Information Technology Company Limited, Chengdu, Sichuan 610041, China
Source: Journal of Computer Applications (《计算机应用》), 2018, No. 5, pp. 1327-1333, 1352 (8 pages)
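Funding: Ministry of Education-China Mobile Research Fund (MCM20160307); Sichuan Province Science and Technology Innovation Seedling Project; Chengdu Science and Technology Bureau International Cooperation Project (2016-GH02-00048-HZ; 2015-GH02-00041-HZ)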
Abstract: A bibliographic information network is a typical heterogeneous information network, and similarity search over it is an active research topic in graph mining. However, existing methods mainly rely on meta-paths or meta-structures and do not consider the semantic features of the nodes themselves, which biases the search results. To address this, a vector-based semantic feature extraction method for bibliographic information networks is proposed, and a vector-based node similarity measure, VSim, is designed and implemented. In addition, VPSim (Similarity computation based on Vector and meta-Path), a similarity search algorithm based on semantic features, is designed by incorporating meta-paths. To improve execution efficiency, a pruning strategy based on the characteristics of bibliographic network data is designed. Experiments on real-world data verify that VSim is suitable for searching entities with similar semantic features, and that VPSim is effective, efficient, and highly scalable.
Keywords: bibliographic information network; similarity search; graph mining; meta-path; semantic feature
Classification number: TP311 [Automation and Computer Technology - Computer Software and Theory]
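The idea described in the abstract, scoring node pairs by both semantic feature vectors and meta-paths, can be illustrated with a minimal sketch. The abstract does not define VSim or VPSim, so everything below is an assumption rather than the authors' method: cosine similarity stands in for the vector-based measure, a PathSim-style normalized count over the author-paper-author meta-path stands in for the meta-path measure, their product is a hypothetical way to combine the two, and skipping candidates with no shared paper is a hypothetical pruning rule. The toy data and all function names are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (assumed stand-in for a vector measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def apa_count(papers_of, a, b):
    """Number of author-paper-author path instances, i.e. papers shared by a and b."""
    return len(papers_of[a] & papers_of[b])

def path_sim(papers_of, a, b):
    """PathSim-style score: 2 * |a~b| / (|a~a| + |b~b|) over the author-paper-author meta-path."""
    denom = apa_count(papers_of, a, a) + apa_count(papers_of, b, b)
    return 2 * apa_count(papers_of, a, b) / denom if denom else 0.0

def combined_sim(a, b, papers_of, features):
    """Hypothetical combination: product of vector similarity and meta-path similarity."""
    return cosine(features[a], features[b]) * path_sim(papers_of, a, b)

def search(query, papers_of, features, k=5):
    """Rank other authors by combined score, skipping those with no shared paper (assumed pruning)."""
    scored = []
    for cand in papers_of:
        if cand == query or apa_count(papers_of, query, cand) == 0:
            continue  # pruned: no path instance, so the combined score would be 0
        scored.append((cand, combined_sim(query, cand, papers_of, features)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]

# Toy data (invented): authors mapped to their papers and to 2-dimensional feature vectors.
papers_of = {"a1": {"p1", "p2"}, "a2": {"p2", "p3"}, "a3": {"p4"}}
features = {"a1": [1.0, 0.0], "a2": [1.0, 0.0], "a3": [0.0, 1.0]}

print(search("a1", papers_of, features))
```

In this toy network only a2 shares a paper with a1, so a3 is pruned before any vector comparison is made. Whether the published algorithm prunes or combines scores in this way is not stated in the abstract.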