Authors: Xiao Bao[1], Li Pu[2,3], Hu Jiaojiao[2], Jiang Yuncheng[2]
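Affiliations: [1] School of Electronic and Information Engineering, Qinzhou University, Qinzhou, Guangxi 535000; [2] School of Computer Science, South China Normal University, Guangzhou 510631; [3] School of Software, Zhengzhou University of Light Industry, Zhengzhou 450000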
Source: Computer Engineering (《计算机工程》), 2017, No. 6, pp. 182-188, 194 (8 pages)
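Funding: National Natural Science Foundation of China (61272066); Guangxi Universities Basic Ability Improvement Project for Young and Middle-aged Teachers (KY2016LX431); Guangzhou Science and Technology Program (2014J4100031); Qinzhou Scientific Research and Technology Development Program (20164407)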
Abstract: Microblog posts are short, have sparse features, and are separated from user queries by a semantic gap, all of which reduce semantic retrieval efficiency. To address this problem, the paper combines text features with knowledge-base semantics to build a semantic retrieval model based on latent semantics and graph structure. First, the Tversky algorithm computes feature relatedness using hashtags as features. Second, Latent Dirichlet Allocation (LDA) trains a topic model on a Wikipedia corpus, and the topic relatedness of texts mapped onto that model is computed with the Jensen-Shannon divergence (JSD). Third, entities and the graph of their relations are extracted from DBpedia, and SimRank computes the relatedness between entities in the graph. The three relatedness scores are then combined into a final relatedness score. Experiments on a Twitter subset with short and long queries show that, compared with a method based on linked open data and graph theory, the proposed model improves MAP, P@30, and R-Prec by 2.98%, 6.40%, and 5.16% respectively, indicating better retrieval performance.
Keywords: microblog; text relatedness; graph structure; Latent Dirichlet Allocation; semantic retrieval
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
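The abstract names three relatedness measures but this record does not give their formulas or say how they are combined. The following is a minimal sketch of two of them, assuming the standard definitions of the Tversky index and the Jensen-Shannon divergence. The function names, the `alpha`/`beta` defaults, the use of `1 - JSD` as a relatedness score, and the equal-weight linear combination are all illustrative assumptions, not the authors' method. The SimRank entity score is taken as a precomputed input.

```python
import math


def tversky(x, y, alpha=0.5, beta=0.5):
    """Standard Tversky index between two hashtag sets.

    alpha and beta weight the elements unique to x and to y;
    the defaults here are an assumption, not values from the paper.
    """
    x, y = set(x), set(y)
    common = len(x & y)
    denom = common + alpha * len(x - y) + beta * len(y - x)
    return common / denom if denom else 0.0


def _kl(p, q):
    """Kullback-Leibler divergence in bits, skipping zero-probability terms of p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence of two topic distributions (base 2, in [0, 1])."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def topic_relatedness(p, q):
    """Turn the divergence into a relatedness score (assumed form: 1 - JSD)."""
    return 1.0 - jsd(p, q)


def combined_relatedness(feature, topic, entity, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical linear combination of the three scores.

    The record only says the three results are combined; equal weights
    are a placeholder assumption.
    """
    w_f, w_t, w_e = weights
    return w_f * feature + w_t * topic + w_e * entity


if __name__ == "__main__":
    feature = tversky({"nlp", "search"}, {"search", "wiki"})
    topic = topic_relatedness([0.7, 0.2, 0.1], [0.6, 0.3, 0.1])
    entity = 0.4  # placeholder for a SimRank score computed elsewhere
    print(combined_relatedness(feature, topic, entity))
```

With `alpha = beta = 1` the Tversky index reduces to the Jaccard coefficient, and with `alpha = beta = 0.5` to the Dice coefficient, so the two parameters control how strongly hashtags unique to either side are penalized.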