Subtopic Mining Research Based on Social Media Reviews (面向社交媒体评论的子话题挖掘研究)  Cited by: 5

Authors: Xia Lihua (夏丽华); Han Dongmei (韩冬梅) [1,2]

Affiliations: [1] School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433; [2] Shanghai Financial Information Technology Key Research Laboratory, Shanghai 200433

Source: Journal of Intelligence (《情报杂志》), 2020, No. 4, pp. 110-116 (7 pages)

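Funding: One of the research outcomes of the Ministry of Education key project under the National Education Sciences Planning program, "Research on the Influence of the New Media Environment on College Students' Academic Emotions and Teaching Strategies" (No. DIA170369).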

Abstract: [Purpose/Significance] Online users share product experiences on social networks, and reviews of even the same product often contain different subtopics (different aspects of the product). Subtopic mining for online reviews can reveal which aspects of a product participants care about and what they need, giving managers more decision support. [Method/Process] Existing topic mining methods mostly rely on classification, clustering, or probabilistic topic models. Because documents describing the same product are often very similar, these methods struggle to keep subtopics distinct. To address this, the paper combines a probabilistic topic model with word co-occurrence relationships and proposes the GPLSA method, which consists of preliminary subtopic identification with the PLSA algorithm, removal of common background words, merging of similar subtopics, and updating of subtopic keywords. [Result/Conclusion] Experiments on a MOOCs dataset from the Zhihu website show that GPLSA achieves higher topic cohesion than existing algorithms and effectively improves the quality of subtopic discovery. Based on the learner needs reflected in the MOOCs subtopics, the paper offers suggestions for improving MOOCs management.
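
The four steps named in the abstract can be illustrated with a rough sketch. The abstract does not give the details of GPLSA, so everything below is an assumption made for illustration: the function names (`plsa`, `top_keywords`, `remove_background`, `merge_similar`), the thresholds `max_share` and `min_jaccard`, the rule that a "background word" is one appearing among the keywords of most subtopics, and the use of keyword-set Jaccard overlap as a simple stand-in for the paper's word co-occurrence component. It is not the authors' implementation.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit plain PLSA by EM on a (docs x words) count matrix.
    Returns P(z|d) with shape (D, K) and P(w|z) with shape (K, W)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # (D, K, W)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts
        expected = counts[:, None, :] * post                  # (D, K, W)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

def top_keywords(p_w_z, vocab, n_top):
    """Step 1 output: the n_top most probable words of each subtopic."""
    return [[vocab[i] for i in np.argsort(-row)[:n_top]] for row in p_w_z]

def remove_background(keyword_lists, max_share=0.5):
    """Step 2 (assumed rule): drop words that occur among the keywords
    of more than max_share of all subtopics."""
    n = len(keyword_lists)
    freq = {}
    for kws in keyword_lists:
        for w in set(kws):
            freq[w] = freq.get(w, 0) + 1
    return [[w for w in kws if freq[w] / n <= max_share] for kws in keyword_lists]

def merge_similar(keyword_lists, min_jaccard=0.5):
    """Steps 3-4 (assumed rule): greedily merge subtopics whose keyword sets
    overlap strongly; the merged keyword list is the ordered union."""
    merged = []
    for kws in keyword_lists:
        for group in merged:
            a, b = set(group), set(kws)
            if a | b and len(a & b) / len(a | b) >= min_jaccard:
                group.extend(w for w in kws if w not in a)
                break
        else:
            merged.append(list(kws))
    return merged
```

A typical call chain under these assumptions would be `plsa` on a document-term count matrix, then `top_keywords`, `remove_background`, and `merge_similar` in that order. The greedy merge depends on the order of the subtopics, which is acceptable for a sketch but would need a more careful clustering step in practice.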

Keywords: social media; online reviews; topic identification; PLSA; word co-occurrence

Classification number: G250.73 [Cultural Science: Library Science]

 
