Research on Tourist Portrait Based on Joint Topic-Sentiment Analysis  (Cited by: 1)

Authors: LI Qin, LI Shaobo[2], HU Jie

Affiliations: [1] College of Big Data Statistics, Guizhou University of Finance and Economics, Guiyang 550000, China; [2] School of Mechanical Engineering, Guizhou University, Guiyang 550000, China

Source: Computer Engineering (《计算机工程》), 2022, No. 6, pp. 278-287, 294 (11 pages)

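Funding: Guizhou Provincial Science and Technology Program (黔科合基础-ZK[2021]337); Youth Science and Technology Talent Growth Project of the Guizhou Provincial Department of Education (黔教合KY字[2021]141); Guizhou University of Finance and Economics Research Start-up Project for Introduced Talents (2021YJ003).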

Abstract: Online text is the medium through which modern tourists record their perceptions and express their opinions, and it has become an important data source for building and analyzing tourist portraits. Existing natural language processing techniques for mining tourist portraits focus mainly on tourists' needs and sentiments and lack an effective connection between the techniques and tourism applications. Moreover, existing text mining techniques usually analyze the topic and the sentiment of a text separately, so the two do not point to each other and users' fine-grained opinions cannot be extracted effectively. This paper proposes a supervised joint topic-sentiment analysis model based on the Variational Auto-Encoder (VAE). Word-frequency weights are introduced into the prior knowledge, and variable parameters are constructed with a Gaussian stick-breaking model to capture correlations in discrete data. Sentiment labels assist the training and generation of topics, which is intended to improve the accuracy of both topic mining and sentiment prediction. The posterior distribution of the Bayesian topic model is computed with the VAE, and sentiment classification under the topic distribution realizes the joint topic-sentiment analysis. Experimental results show that the model's average sentiment prediction accuracy is about 85% when the number of topics ranges from 10 to 100 and that, compared with the LDA, SAGE, and NVDM models, it mines the characteristics of hotel user reviews effectively.
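The abstract names two building blocks without giving their form: a Gaussian stick-breaking construction for the topic proportions and a sentiment classifier that operates on the topic distribution. The sketch below illustrates one common way these pieces fit together; it is not the authors' implementation. It assumes the widely used stick-breaking parameterization (a Gaussian draw passed through a sigmoid to obtain breaking fractions), and every name, dimension, and weight in it (`gaussian_stick_breaking`, `sentiment_probs`, `K`, `H`, `C`, the random matrices) is hypothetical.

```python
import math
import random


def gaussian_stick_breaking(mu, log_sigma, W, rng):
    """Draw topic proportions theta (length len(W) + 1) from a Gaussian latent."""
    # Reparameterised Gaussian sample: x = mu + sigma * eps, eps ~ N(0, 1)
    x = [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]
    # Breaking fractions in (0, 1): one sigmoid per row of the projection W
    eta = [1.0 / (1.0 + math.exp(-sum(w * xi for w, xi in zip(row, x))))
           for row in W]
    # Break the unit stick: each topic takes a fraction of what is still left
    theta, remaining = [], 1.0
    for e in eta:
        theta.append(e * remaining)
        remaining *= 1.0 - e
    theta.append(remaining)  # the last topic receives the leftover stick
    return theta


def sentiment_probs(theta, V, b):
    """Softmax sentiment classifier whose only input is the topic distribution."""
    logits = [sum(v * t for v, t in zip(row, theta)) + bias
              for row, bias in zip(V, b)]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sizes: K topics, H latent dimensions, C sentiment classes
rng = random.Random(0)
K, H, C = 10, 16, 2
mu, log_sigma = [0.0] * H, [0.0] * H
W = [[0.1 * rng.gauss(0.0, 1.0) for _ in range(H)] for _ in range(K - 1)]
V = [[rng.gauss(0.0, 1.0) for _ in range(K)] for _ in range(C)]
b = [0.0] * C

theta = gaussian_stick_breaking(mu, log_sigma, W, rng)
probs = sentiment_probs(theta, V, b)
```

In a trained model, `mu` and `log_sigma` would come from an encoder network over the bag-of-words input, and `W`, `V`, and `b` would be learned jointly with the reconstruction objective; here they are fixed or random only so that the sketch runs on its own.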

Keywords: tourist portrait; variational auto-encoder; joint topic-sentiment analysis; opinion mining; Latent Dirichlet Allocation (LDA) model

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
