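Author: CHEN Cui-juan (陈翠娟), Fuzhou Institute of Technology, Fuzhou 350506, Fujian Province, China
Affiliation: [1] Fuzhou Institute of Technology, Fuzhou 350506
Source: Journal of JingDeZhen University (《景德镇学院学报》), 2021, No. 3, pp. 92-95 (4 pages)
Funding: Supported by the Fuzhou Institute of Technology Research Fund (FTKY004).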
Abstract: Naive Bayes is a probability-based classification algorithm that uses prior probabilities to compute posterior probabilities. It is simple and effective, is widely used in text classification, and can separate negative from positive reviews efficiently and accurately in large volumes of data. Within the Naive Bayes family, the Multinomial Naive Bayes model is especially suited to features that describe occurrence counts, so it is commonly used for text classification. This paper uses Multinomial Naive Bayes to classify merchant reviews and extends the traditional algorithm with TF-IDF feature weighting and Laplace smoothing. The algorithm is implemented in Python and tested on more than ten thousand reviews from multiple merchants. The experimental results show that the improved Multinomial Naive Bayes classifier achieves high prediction accuracy, is stable, and predicts quickly.
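Keywords: Naive Bayes; TF-IDF; Laplace smoothing; text classification; Python
Classification No.: TP391.1 [Automation and Computer Technology / Computer Application Technology]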
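
The abstract names the ingredients (Multinomial Naive Bayes, TF-IDF weighting, Laplace smoothing, Python) but this record does not include the implementation. The following is a minimal sketch of how those pieces might fit together, not the author's code. The class name `TfidfMultinomialNB`, the helper `tfidf_vectors`, the smoothed IDF formula, and the toy English reviews are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return ({term: tf-idf weight} per document, idf table) for tokenized docs."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (assumed formula; the record does not state the one used).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors, idf


class TfidfMultinomialNB:
    """Hypothetical Multinomial Naive Bayes over TF-IDF weights."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # alpha = 1.0 corresponds to Laplace (add-one) smoothing

    def fit(self, docs, labels):
        vectors, self.idf = tfidf_vectors(docs)
        self.vocab = set(self.idf)
        self.class_count = Counter(labels)
        self.weights = defaultdict(lambda: defaultdict(float))
        self.total = defaultdict(float)
        # Accumulate TF-IDF mass per (class, term) instead of raw counts.
        for vec, y in zip(vectors, labels):
            for t, w in vec.items():
                self.weights[y][t] += w
                self.total[y] += w
        return self

    def predict(self, doc):
        # Terms never seen in training are ignored.
        counts = Counter(t for t in doc if t in self.vocab)
        n_docs = sum(self.class_count.values())
        best, best_score = None, -math.inf
        for y, c in self.class_count.items():
            score = math.log(c / n_docs)  # log prior
            denom = self.total[y] + self.alpha * len(self.vocab)
            for t, k in counts.items():
                # Smoothed log likelihood, weighted by the term's count and IDF.
                likelihood = (self.weights[y].get(t, 0.0) + self.alpha) / denom
                score += k * self.idf[t] * math.log(likelihood)
            if score > best_score:
                best, best_score = y, score
        return best


# Toy, made-up training reviews (already tokenized).
train = [
    ("good food great service".split(), "pos"),
    ("great taste good price".split(), "pos"),
    ("bad food slow service".split(), "neg"),
    ("terrible taste bad price".split(), "neg"),
]
model = TfidfMultinomialNB(alpha=1.0).fit(
    [doc for doc, _ in train], [label for _, label in train]
)
print(model.predict("good great".split()))
print(model.predict("bad slow".split()))
```

Chinese reviews would first need word segmentation to produce the token lists; that step is left out here. The smoothing term `alpha` keeps a term that never appeared in a class from driving that class's likelihood to zero.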