Authors: ZHANG Yanan (张亚男); FENG Jianwen (冯建文) [1]
Affiliation: [1] School of Computer, Hangzhou Dianzi University (杭州电子科技大学计算机学院), Hangzhou, Zhejiang 310018, China
Source: Journal of Hangzhou Dianzi University: Natural Sciences (《杭州电子科技大学学报(自然科学版)》), 2018, No. 1, pp. 59-64, 80 (7 pages)
Abstract: Partition-based clustering algorithms are sensitive to the choice of initial centers. To address this weakness, this paper proposes a new hot topic detection method. First, to reduce the error introduced by different ways of expressing the same meaning, feature weights are computed with a term frequency-inverse document frequency (TF-IDF) function combined with semantic similarity. Then, the AGNES algorithm is used to cluster the data and obtain the initial cluster centers, and the K-means algorithm produces the final clusters from those centers. Finally, the influence of a microblog post's repost count and comment count on its popularity is analyzed, and the heat of each topic is computed and used to rank the results. Experiments verify the effectiveness of the new method.
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
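The pipeline described in the abstract (feature weighting, AGNES to seed the centers, K-means for the final clusters, then heat ranking) can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the authors' implementation: it uses plain smoothed TF-IDF and leaves out the semantic-similarity term, it merges clusters by centroid distance because the abstract does not state the linkage, and the heat weights `w_repost` and `w_comment`, the dictionary keys, and the sample posts are all hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Plain smoothed TF-IDF; the paper's semantic-similarity term is NOT modelled."""
    vocab = sorted({w for d in docs for w in d})
    df = Counter(w for d in docs for w in set(d))
    n = len(docs)
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append([tf[w] / len(d) * (math.log((1 + n) / (1 + df[w])) + 1)
                     for w in vocab])
    return vecs

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def agnes_centers(vecs, k):
    """AGNES-style bottom-up merging (centroid distance, an assumption) down to k clusters."""
    clusters = [[v] for v in vecs]
    while len(clusters) > k:
        pairs = ((a, b) for a in range(len(clusters))
                 for b in range(a + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: dist(centroid(clusters[p[0]]),
                                             centroid(clusters[p[1]])))
        merged = clusters.pop(j)      # j > i, so index i stays valid
        clusters[i].extend(merged)
    return [centroid(c) for c in clusters]

def kmeans(vecs, centers, iters=20):
    """Standard K-means, started from the AGNES centers instead of random ones."""
    labels = []
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda c: dist(v, centers[c]))
                  for v in vecs]
        new_centers = []
        for c in range(len(centers)):
            members = [v for v, l in zip(vecs, labels) if l == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:    # converged
            break
        centers = new_centers
    return labels

def topic_heat(posts, labels, w_repost=0.6, w_comment=0.4):
    """Hypothetical heat score: weighted sum of reposts and comments per topic."""
    heat = Counter()
    for post, label in zip(posts, labels):
        heat[label] += w_repost * post["reposts"] + w_comment * post["comments"]
    return heat.most_common()         # hottest topic first

# Made-up sample posts, already tokenized.
posts = [
    {"words": ["earthquake", "rescue", "city"], "reposts": 120, "comments": 45},
    {"words": ["earthquake", "rescue", "team"], "reposts": 80, "comments": 30},
    {"words": ["football", "match", "goal"], "reposts": 10, "comments": 5},
    {"words": ["football", "goal", "league"], "reposts": 7, "comments": 2},
]
vecs = tfidf([p["words"] for p in posts])
labels = kmeans(vecs, agnes_centers(vecs, k=2))
ranking = topic_heat(posts, labels)
print(labels, ranking)
```

Seeding K-means with the AGNES centroids makes the result deterministic for a given input, which is the point of the two-stage design: the hierarchical pass replaces the random initialization that partition clustering is sensitive to.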