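Authors: Zhang Junsan (张俊三) [1,2], Qu Youli (瞿有利) [1], Shui Yidong (税仪冬) [1], Tian Shengfeng (田盛丰) [1]
Affiliations: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044; [2] College of Computer and Communication Engineering, China University of Petroleum, Qingdao, Shandong 266580
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2014, No. 6, pp. 1359-1372 (14 pages)
Funding: Fundamental Research Funds for the Central Universities (2011JBM231)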
Abstract: Entity ranking is an important step in related entity finding (REF). Existing Wikipedia-based entity ranking for REF has four problems: target types are acquired semi-automatically, target types are coarse-grained, entity-type relevancy is judged as a binary value, and the computation of entity-relation relevancy ignores the effect of stop words. This paper designs an entity-ranking framework that ranks entities by combining three scores: entity relevancy, entity-type relevancy, and entity-relation relevancy. The best combination method is selected by comparing several alternatives experimentally. A new method for computing entity-type relevancy is proposed: it automatically acquires a fine-grained target entity type, learns by induction a set of discriminative rules for the hyponym Wikipedia categories of that type, and computes entity-type relevancy by counting the categories of a candidate entity that satisfy those rules. A "cut stop words to rebuild relation" approach is also proposed to compute the relation relevancy between a candidate entity and the source entity. Experiments show that the proposed methods improve entity-ranking results and reduce computation time.
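Keywords: related entity finding; entity ranking; entity-type relevancy; entity-relation relevancy; Wikipedia
Classification: TP391 [Automation and Computer Technology - Computer Application Technology]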
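The abstract names three scores and two specific ideas: counting the categories that satisfy discriminative rules, and dropping stop words before comparing relations. The sketch below is only an illustration of those ideas under stated assumptions, not the authors' implementation. The stop-word list, the rule predicates, the normalization by category count, the token-overlap measure, and the product combination are all hypothetical choices; the paper compares several combination methods and the abstract does not say which one performed best.

```python
import re

# Hypothetical stop-word list; the paper's actual list is not given in the abstract.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "by", "to", "is",
              "was", "and", "for", "with", "that"}


def type_relevancy(categories, rules):
    """Share of an entity's Wikipedia categories that satisfy at least one rule.

    `rules` is a list of predicates standing in for the learned
    discriminative rules; dividing by the category count is an assumption.
    """
    if not categories:
        return 0.0
    hits = sum(1 for c in categories if any(rule(c) for rule in rules))
    return hits / len(categories)


def strip_stop_words(text):
    """Lowercase, tokenize, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def relation_relevancy(relation, context):
    """Token overlap between the stop-word-free relation and a context snippet.

    Overlap is an assumed stand-in for the paper's relevancy measure.
    """
    rel = set(strip_stop_words(relation))
    if not rel:
        return 0.0
    ctx = set(strip_stop_words(context))
    return len(rel & ctx) / len(rel)


def combined_score(entity_rel, type_rel, relation_rel):
    """One possible combination (a plain product) of the three scores."""
    return entity_rel * type_rel * relation_rel


# Usage with made-up data.
rules = [lambda c: c.lower().startswith("airlines")]
cats = ["Airlines of the United States", "Companies based in Texas"]
t = type_relevancy(cats, rules)
r = relation_relevancy("airlines that fly the Boeing 747",
                       "Lufthansa operates Boeing 747 aircraft")
score = combined_score(0.8, t, r)
```

A product penalizes any candidate that scores near zero on one component; a weighted sum would be more forgiving. Which behaviour is preferable is the kind of question the paper's experimental comparison addresses.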