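Affiliations: [1] School of Science, Tianjin University of Commerce, Tianjin [2] School of Information Engineering, Tianjin University of Commerce, Tianjin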
Source: Advances in Applied Mathematics (《应用数学进展》), 2023, No. 3, pp. 1090-1099 (10 pages)
Abstract: This paper addresses sentiment analysis of text, using the Yelp dataset for evaluation. Rating prediction for Yelp reviews can be approached in several ways, such as sentiment analysis and five-star rating classification. Here we predict restaurant ratings from the review text. After analyzing the distribution of the original data, we first create a balanced training subset, then split the dataset and extract features. We apply two machine learning methods, naive Bayes and logistic regression, and three Transformer-based deep learning models, BERT, DistilBERT, and RoBERTa, and compare them. Results are reported in terms of training time and training performance, giving readers a practical basis for choosing among the models.
Keywords: Naive Bayes; Logistic Regression; BERT; DistilBERT; RoBERTa
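
The classical part of the pipeline described in the abstract (build a balanced subset, extract word-count features, train naive Bayes) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function and class names, the whitespace tokenizer, the Laplace smoothing, and the toy reviews are all assumptions made for illustration, and the record does not say which features or libraries the paper used.

```python
import math
import random
from collections import Counter, defaultdict


def balanced_subset(reviews, per_class, seed=0):
    """Downsample so each star rating keeps at most `per_class` reviews."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, stars in reviews:
        by_label[stars].append((text, stars))
    subset = []
    for stars in sorted(by_label):
        items = by_label[stars]
        rng.shuffle(items)
        subset.extend(items[:per_class])
    rng.shuffle(subset)
    return subset


def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real feature extraction.
    return text.lower().split()


class MultinomialNB:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, samples):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in samples:
            self.class_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(text):
                if tok in self.vocab:
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews (text, star rating); not taken from the Yelp dataset.
reviews = [
    ("great food loved it", 5),
    ("amazing service great taste", 5),
    ("wonderful great place", 5),
    ("terrible food awful service", 1),
    ("awful rude staff", 1),
]

train = balanced_subset(reviews, per_class=2)
model = MultinomialNB().fit(train)
print(model.predict("great place"))
```

Downsampling to the same count per class keeps the class priors equal, so the prediction is driven by the word likelihoods rather than by the skew in the original rating distribution.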