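Authors: LI Ming-Zhu; MI Chuan-Min [1]; XIAO Lin [1]; XU Nai-Yuan (College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)
Affiliation: [1] College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
Source: Computer Systems & Applications (《计算机系统应用》), 2022, Issue 6, pp. 315-323 (9 pages)
Funding: National Natural Science Foundation of China (72001106).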
Abstract: Network dramas have developed rapidly in recent years, and research on their broadcast volume has attracted growing attention. Broadcast volume reflects a network drama's reputation and popularity, which are closely tied to the profits of producers and investors. However, existing research has not yet considered how the sentiment expressed in viewers' comments affects broadcast volume, and existing forecasting models are relatively simple, so prediction accuracy leaves room for improvement. Building on a sentiment analysis of user comments, this paper constructs a stacking ensemble learning model to predict the broadcast volume of network dramas in China. First, a sentiment dictionary for the network drama domain is built with the SO-PMI (semantic orientation-pointwise mutual information) algorithm; combined with a basic sentiment dictionary and weights based on the number of likes, it is used to compute comment sentiment scores, which are added to the prediction index system. Next, with random forest (RF), GBDT (gradient boosting decision tree), XGBoost (extreme gradient boosting), and LightGBM (light gradient boosting machine) as base learners and MLR as the meta learner, a staged stacking prediction model is constructed that uses current data to predict the next week's broadcast volume. Finally, the models are compared and importance scores of the predictive variables are obtained. The experimental results show that the coefficient of determination (R-squared) of the proposed model reaches 0.89, higher than that of any single base learner (maximum 0.84) and that of the stacking model without the sentiment score variable (0.81). It can be concluded that, once the sentiment score variable is included, the proposed stacking ensemble learning model improves the prediction accuracy of network drama broadcast volume to some extent.
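Illustrative note: the abstract names SO-PMI as the way the domain sentiment dictionary is built but does not give the formula or the seed words. The sketch below shows one common form of SO-PMI, not the authors' implementation. It assumes comment-level co-occurrence counts, additive smoothing, and a tiny hypothetical corpus with one positive and one negative seed word; the function name `so_pmi`, the `smoothing` parameter, and all sample data are invented for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Return {word: SO-PMI score} for every non-seed word in docs.

    docs: list of tokenized comments (each a list of strings).
    Co-occurrence is counted once per comment, and `smoothing` is added
    to every joint count so the logarithm is always defined. Both are
    assumptions of this sketch, not details taken from the paper.
    """
    n_docs = len(docs)
    word_df = Counter()  # number of comments containing each word
    pair_df = Counter()  # number of comments containing both words of a pair
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs are stored in sorted order

    def pmi(a, b):
        key = (a, b) if a < b else (b, a)
        joint = pair_df[key] + smoothing
        return math.log2(joint * n_docs / (word_df[a] * word_df[b]))

    # Seeds that never occur in the corpus carry no information; skip them.
    pos = [s for s in pos_seeds if word_df[s] > 0]
    neg = [s for s in neg_seeds if word_df[s] > 0]

    scores = {}
    for word in word_df:
        if word in pos_seeds or word in neg_seeds:
            continue
        # Association with positive seeds minus association with negative seeds.
        scores[word] = sum(pmi(word, s) for s in pos) - sum(pmi(word, s) for s in neg)
    return scores


# Hypothetical tokenized comments, for illustration only.
comments = [
    ["plot", "gripping", "good"],
    ["acting", "gripping", "good"],
    ["plot", "draggy", "bad"],
    ["ending", "draggy", "bad"],
    ["acting", "good"],
]
scores = so_pmi(comments, pos_seeds={"good"}, neg_seeds={"bad"})
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word}: {score:+.2f}")
```

Words with a clearly positive score would be candidates for the positive side of a domain dictionary and clearly negative ones for the negative side. With seed words of unequal frequency, neutral words can drift away from zero, so a threshold on the absolute score is usually needed in practice.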