EMTM: A Method for Mining Topic-Related Experts in Micro-Blogs  Cited by: 5

EMTM: A Method for Experts Mining in Micro-Blog with Topic-Level


Authors: Zhang Lamei (张腊梅)[1,2,3,4], Huang Weijing (黄威靖)[3,4], Chen Wei (陈薇)[3,4], Wang Tengjiao (王腾蛟)[3,4], Lei Kai (雷凯)[1,2]

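Affiliations: [1] Shenzhen Key Lab for Cloud Computing Technology & Applications (Peking University), Shenzhen, Guangdong 518055; [2] School of Electronic and Computer Engineering, Peking University, Shenzhen, Guangdong 518055; [3] Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871; [4] School of Electronics Engineering and Computer Science, Peking University, Beijing 100871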

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2015, No. 11, pp. 2517-2526 (10 pages)

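Funding: National High Technology Research and Development Program of China (863 Program) (2012AA011002); National Natural Science Foundation of China (61300003); Specialized Research Fund for the Doctoral Program of Higher Education of China (20130001120001)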

Abstract: Micro-blogging has become one of the most popular platforms for accessing and sharing information. After years of growth, micro-blog platforms host many experts with authoritative professional backgrounds, and mining topic-related experts supports downstream tasks such as user recommendation and public opinion analysis. In a micro-blog, an expert on a topic is a user who has high influence on that topic because of reliable professional knowledge or skills related to it. High influence is therefore a necessary condition for expertise. Influence is a subjective notion that needs to be quantified objectively, and the probability of being retweeted is one of the most important indicators of a user's influence, so high-influence users can be found by analyzing retweet data. However, users retweet in two different ways, "topic-sensitive retweet" and "following retweet", so a user who is influential merely because of a high probability of being retweeted is not necessarily an expert. This paper proposes EMTM (experts mining topic model), a probabilistic generative model based on topic models that mines topic-level experts by distinguishing between the two kinds of retweet behavior. Inference is performed with Gibbs sampling. Comparative experiments on a real Sina Weibo dataset show that EMTM mines topic-related experts effectively.

Keywords: expert; topic; micro-blog; retweet behavior; probabilistic model

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
