Authors: CHEN Yanhua [1], LI Jian [1] ([1] School of Computer Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: Computer Engineering (《计算机工程》), 2018, No. 11, pp. 197-201, 208 (6 pages)
Funding: National Natural Science Foundation of China (U1636106; 61472048)
Abstract: To compute tree similarity efficiently, a similarity computation method based on the structural features of trees is proposed. All non-isomorphic subtree shapes with K nodes are constructed, and the number of subtrees of the input tree that are isomorphic to each shape is counted; these counts form a feature vector that is used to compute tree similarity. Rather than computing the similarity between trees directly, the method represents it indirectly through the trees' structural features, which makes it applicable to similarity computation on large-scale datasets. Experimental results show that, for feature vector extraction, the time cost of the algorithm grows linearly with the number of tree nodes; for similarity computation, 74% of same-class data have a similarity above 0.7, while the similarity between data of different classes does not exceed 0.2, indicating that the extracted feature vectors characterize the original programs well.
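The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several points are assumptions the abstract does not settle: trees are taken to be rooted and unordered, a "K-node subtree" is read as any connected set of K nodes, and cosine similarity is used as the comparison measure. The names `shapes_rooted_at`, `feature_vector`, and `cosine` are hypothetical. Instead of generating all shapes first and matching them afterwards, the sketch counts occurrences per canonical shape directly, which yields the same counts for shapes that occur at least once.

```python
from collections import Counter
from math import sqrt

# Assumed representation: a tree is a tuple of child trees; a leaf is ().
# A shape is encoded canonically as a sorted tuple of child shapes, so two
# isomorphic (unordered, rooted) subtrees always get the same key.

def shapes_rooted_at(tree, k):
    """Count connected k-node subtrees whose topmost node is the root of `tree`."""
    if k == 1:
        return Counter({(): 1})
    # partial[used] maps a sorted tuple of chosen child shapes to the number
    # of ways to choose it using `used` nodes from the children seen so far.
    partial = {0: Counter({(): 1})}
    for child in tree:
        new = {used: Counter(c) for used, c in partial.items()}  # skip this child
        for size in range(1, k):
            child_shapes = shapes_rooted_at(child, size)
            for used, combos in partial.items():
                if used + size > k - 1:
                    continue
                target = new.setdefault(used + size, Counter())
                for combo, n in combos.items():
                    for shape, m in child_shapes.items():
                        target[tuple(sorted(combo + (shape,)))] += n * m
        partial = new
    return partial.get(k - 1, Counter())

def feature_vector(tree, k):
    """Count, per shape, all connected k-node subtrees anywhere in `tree`."""
    counts = Counter(shapes_rooted_at(tree, k))
    for child in tree:
        counts.update(feature_vector(child, k))
    return counts

def cosine(a, b):
    """Cosine similarity of two sparse count vectors (an assumed measure)."""
    dot = sum(a[s] * b[s] for s in a.keys() & b.keys())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

path3 = (((),),)       # root -> child -> grandchild
star3 = ((), ())       # root with two leaf children
star4 = ((), (), ())   # root with three leaf children

print(feature_vector(star4, 3))
print(cosine(feature_vector(star3, 3), feature_vector(star4, 3)))
print(cosine(feature_vector(path3, 3), feature_vector(star3, 3)))
```

This sketch recomputes child counts for every size and is meant only to make the feature definition concrete; it does not reproduce the linear scaling reported in the abstract.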
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]