Authors: JIANG Zong-li (蒋宗礼); ZHAO Si-lu (赵思露) (Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source: Software Guide (《软件导刊》), 2018, No. 12, pp. 177-181 (5 pages)
Abstract: Clustering search results can effectively improve the efficiency and quality of information acquisition. To address the high data dimensionality and lack of semantic understanding in traditional text clustering models, this paper proposes a topic modeling algorithm for search result clustering that incorporates co-occurrence analysis. Based on an improved LDA model, the resulting "document-topic" probability distribution is clustered with the K-means algorithm, and topic words are then extracted from the cluster centers to serve as cluster labels. Experimental results show that, applied to search result clustering, the improved LDA algorithm achieves good clustering quality and produces cluster labels with good readability.
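The last two stages described in the abstract (K-means over the "document-topic" distribution, then labels taken from the cluster centers) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the improved LDA model, so the document-topic matrix `theta`, the topic-word matrix `phi`, and the vocabulary are hypothetical, hand-written values standing in for its output. The labeling rule (rank words by the center-weighted mix of topic-word probabilities) is likewise an assumption.

```python
def kmeans(points, k, max_iters=100):
    """Plain K-means; the first k points seed the centers (deterministic)."""
    centers = [list(p) for p in points[:k]]
    assign = None
    for _ in range(max_iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assign == assign:
            break  # assignments stopped changing, so the centers are stable
        assign = new_assign
        # Update step: each center becomes the mean of its member points.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers


def cluster_labels(centers, phi, vocab, top_n=2):
    """Assumed labeling rule: score each word by sum_k center[k] * phi[k][w]."""
    labels = []
    for center in centers:
        scores = [
            sum(center[k] * phi[k][w] for k in range(len(center)))
            for w in range(len(vocab))
        ]
        ranked = sorted(range(len(vocab)), key=lambda w: scores[w], reverse=True)
        labels.append([vocab[w] for w in ranked[:top_n]])
    return labels


# Hypothetical document-topic distribution (6 documents, 3 topics).
theta = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
]

# Hypothetical topic-word distribution over a toy vocabulary.
vocab = ["apple", "fruit", "python", "code", "music"]
phi = [
    [0.5, 0.4, 0.05, 0.03, 0.02],
    [0.02, 0.03, 0.5, 0.4, 0.05],
    [0.1, 0.1, 0.1, 0.1, 0.6],
]

assign, centers = kmeans(theta, k=2)
labels = cluster_labels(centers, phi, vocab)
print(assign)
print(labels)
```

Clustering in topic space rather than term space is what keeps the dimensionality low: each document is a vector with one entry per topic, not one entry per vocabulary word.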
Classification No.: TP319 [Automation and Computer Technology: Computer Software and Theory]