Authors: Liu Yezheng (刘业政)[1], Du Yanan (杜亚楠)[1], Jiang Yuanchun (姜元春)[1], Du Fei (杜非)[1]
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2015, No. 1, pp. 27-34 (8 pages)
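Funding: Supported by the National Basic Research Program of China (973 Program) (No. 2013CB329603); the National Natural Science Foundation of China (No. 71071047); and the Humanities and Social Sciences Foundation of the Ministry of Education of China (No. 12YJC630073)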
Abstract: Keeping up with the hot topics that the public cares about is an important prerequisite for commercial innovation and business marketing. Most existing methods rely on processing unstructured data or on repeatedly traversing the sample set, which makes them computationally expensive. Starting from the statistical properties of topics, this paper proposes a nonparametric method built on structured data. First, a heat curve is constructed for each topic to characterize its degree of diffusion (how widely it spreads) and its degree of focus (how concentrated the attention on it is). Next, these heat curves, which vary widely in shape, are classified and modeled to obtain the common features and development patterns of each class of curve. Finally, a weighted-voting rule over the classification models predicts whether a new topic will develop into a hot topic. Data collection and experiments on the Sina Weibo platform show that the method uses a simple data structure, performs well, has low complexity, and is easy to control.
Classification No.: TP393.092 [Automation and Computer Technology: Computer Application Technology]
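The three steps named in the abstract (heat curve, curve classes, weighted vote) can be illustrated with a minimal Python sketch. This record does not give the paper's formulas, so everything concrete here is an assumption made for illustration: the linear blend and the `alpha` parameter in `heat_curve`, the Euclidean `distance`, the use of one hand-picked prototype curve per class in place of the paper's classification models, and the `weight / (1 + distance)` voting rule. It should not be read as the authors' method.

```python
from math import sqrt


def heat_curve(diffusion, focus, alpha=0.5):
    """Blend per-interval diffusion and focus values into one heat curve.

    ASSUMPTION: the linear blend and `alpha` are hypothetical; the record
    does not say how the two degrees are combined.
    """
    if len(diffusion) != len(focus):
        raise ValueError("diffusion and focus must have the same length")
    return [alpha * d + (1.0 - alpha) * f for d, f in zip(diffusion, focus)]


def distance(a, b):
    """Euclidean distance between two equal-length curves (an assumption)."""
    if len(a) != len(b):
        raise ValueError("curves must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_hot(curve, prototypes):
    """Weighted vote over class prototypes.

    `prototypes` is a list of (prototype_curve, is_hot, weight) tuples.
    ASSUMPTION: each prototype votes for its own label with strength
    weight / (1 + distance); the paper's actual rule is not given here.
    """
    hot_votes = 0.0
    other_votes = 0.0
    for proto, is_hot, weight in prototypes:
        vote = weight / (1.0 + distance(curve, proto))
        if is_hot:
            hot_votes += vote
        else:
            other_votes += vote
    return hot_votes > other_votes


# Hypothetical prototypes: a rising curve labeled hot, a flat one labeled not hot.
PROTOTYPES = [
    ([1.0, 2.0, 4.0, 8.0], True, 1.0),
    ([1.0, 1.0, 1.0, 1.0], False, 1.0),
]

new_topic = heat_curve(diffusion=[2, 4, 8, 16], focus=[0, 0, 0, 0])
is_hot = predict_hot(new_topic, PROTOTYPES)
```

Because the vote only compares a new curve against a small, fixed set of class representatives, prediction cost grows with the number of classes rather than with the size of the sample set, which is in the spirit of the low-complexity claim in the abstract.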