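Affiliations: [1] Department of Health Management, Bengbu Medical College, Bengbu, Anhui 233030 [2] School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027 [3] School of Computer Science and Technology, Anhui University of Technology, Ma'anshan, Anhui 243032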
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2017, No. 4, pp. 738-743 (6 pages)
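Funding: Supported by the National Natural Science Foundation of China (61402009) and the Anhui Provincial Key Project for Humanities in Universities (sk2015A405)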
Abstract: Identifying and tracking the evolution of subtopics within online news plays an important role in news retrieval, news recommendation, and public opinion analysis. Existing subtopic detection methods suffer from unclear subtopic identification, skewed evolution tracking, and an inability to capture the intensity distribution of subtopic coverage. By analyzing the characteristics of news reports and the principle of the Latent Dirichlet Allocation (LDA) topic model, this paper introduces the publication time of each report into LDA and proposes DST-LDA (Dynamic Subtopic and Time based Topic Model), a subtopic evolution tracking method that jointly models subtopics and time. The model avoids the heavy dependence on time slicing that limited earlier tracking algorithms, and it produces three distribution matrices: document-subtopic θ, subtopic-word φ, and subtopic-time π. By selecting the characteristic words and characteristic times of each subtopic, it efficiently separates the news subtopics and identifies the temporal distribution and reporting strength of each one. DST-LDA was tested on 4 real Internet news data sets provided by Tencent News and Sina News and compared with other mainstream algorithms. The experimental results show that DST-LDA tracks news subtopic evolution well and performs better than the state-of-the-art approaches.
Keywords: news subtopic; LDA; news evolution; subtopic detection; DST-LDA; subtopic tracking
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
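The abstract says that subtopics are characterized by selecting characteristic words and characteristic times from the subtopic-word matrix φ and the subtopic-time matrix π. The following is a minimal sketch of that selection step only, not the authors' implementation: it assumes φ and π are already available as row-normalized lists (one row per subtopic) and simply picks the highest-probability entries. The function names, the `k_words`/`k_dates` parameters, and the toy numbers are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence


def top_indices(row: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest entries of row, largest first."""
    return sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]


def describe_subtopics(
    phi: Sequence[Sequence[float]],   # subtopic-word distribution, one row per subtopic
    pi: Sequence[Sequence[float]],    # subtopic-time distribution, one row per subtopic
    vocab: Sequence[str],             # word for each column of phi
    dates: Sequence[str],             # time label for each column of pi
    k_words: int = 2,
    k_dates: int = 2,
) -> List[Dict[str, List[str]]]:
    """Pick the most probable words and time labels for every subtopic."""
    summaries = []
    for word_row, date_row in zip(phi, pi):
        summaries.append({
            "words": [vocab[i] for i in top_indices(word_row, k_words)],
            "dates": [dates[i] for i in top_indices(date_row, k_dates)],
        })
    return summaries


# Toy, made-up distributions (not results from the paper).
vocab = ["quake", "rescue", "donation", "rebuild"]
dates = ["day1", "day2", "day3"]
phi = [[0.50, 0.30, 0.12, 0.08],
       [0.05, 0.15, 0.30, 0.50]]
pi = [[0.7, 0.2, 0.1],
      [0.1, 0.3, 0.6]]

summaries = describe_subtopics(phi, pi, vocab, dates)
for idx, summary in enumerate(summaries):
    print(idx, summary["words"], summary["dates"])
```

In this toy input the first subtopic is concentrated on early dates and the second on later ones, which is the kind of duration and strength pattern the abstract says π is meant to expose.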