Algorithm for Prediction of New Topic's Hotness Using the K-nearest Neighbors  Cited by: 31


Authors: Nie Enlun[1], Chen Li[1], Wang Yaqiang[1], Qin Xiangqing[1], Jin Yu[1], Yu Zhonghua[1]

Affiliation: [1] College of Computer Science, Sichuan University, Chengdu 610065

Source: Computer Science (《计算机科学》), 2012, Issue B06, pp. 257-260 (4 pages)

Funding: Supported by the Specialized Research Fund for the Doctoral Program of Higher Education (20100181120029)

Abstract: With the rapid development of the Internet, online public opinion has become a focus of attention for government departments, enterprises, and the general public. Monitoring it effectively and guiding it correctly is a pressing problem, and topic hotness prediction is the basis for both. Because existing algorithms cannot effectively predict the hotness of new topics, this paper proposes a new-topic hotness prediction algorithm based on K-nearest neighbors (K-NN). The algorithm predicts the hotness of a new topic from the click-count time series of historical topics that are similar to it. Experimental results show that, when the allowed relative error is below 10%, 20%, and 30%, the average accuracy of the predicted click counts for the first 3 days is 47.26%, 61%, and 67.7%, respectively, and the average accuracy of the predicted click-count trend reaches 73.73%. The results also indicate that similar topics follow approximately the same hotness trend in the early period after they appear.
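
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how topic similarity is measured or how the neighbors' series are combined, so everything specific here is an assumption made for illustration: topics are represented by hypothetical feature vectors, similarity is cosine similarity, and the forecast is a similarity-weighted average of the click counts of the k most similar historical topics. The names `cosine`, `predict_hotness`, and `within_tolerance` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_hotness(new_vec, history, k=3):
    """Forecast daily click counts for a new topic.

    history: list of (feature_vector, clicks_per_day) pairs for past topics.
    Returns one predicted click count per day, as a similarity-weighted
    average over the k most similar historical topics (an assumed scheme).
    """
    scored = sorted(
        ((cosine(new_vec, vec), clicks) for vec, clicks in history),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    days = len(scored[0][1])
    total = sum(sim for sim, _ in scored)
    if total == 0:
        # No neighbor is similar at all: fall back to a plain average.
        return [sum(c[d] for _, c in scored) / len(scored) for d in range(days)]
    return [sum(sim * c[d] for sim, c in scored) / total for d in range(days)]


def within_tolerance(predicted, actual, tol):
    """True if the relative error of a prediction is below tol (e.g. 0.1 for 10%)."""
    return abs(predicted - actual) / actual < tol


# Toy data, invented for illustration only.
history = [
    ([1.0, 0.0], [100, 200, 300]),
    ([1.0, 0.0], [200, 400, 600]),
    ([0.0, 1.0], [1000, 1000, 1000]),
]
forecast = predict_hotness([1.0, 0.0], history, k=2)
```

In this toy run the two historical topics that share the new topic's feature vector are selected, and the forecast for each of the first 3 days is the average of their click counts. A check such as `within_tolerance(forecast[0], actual_clicks, 0.1)` corresponds to the kind of relative-error criterion the abstract reports accuracy against, with `actual_clicks` standing for an observed value.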

Keywords: hotness prediction; new topic; K-nearest neighbors algorithm; topic similarity; online public opinion

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
