Improved Multinomial Naive Bayes Classification Algorithm and Python Implementation  Cited by: 8

Author: CHEN Cui-juan (陈翠娟) (Fuzhou Institute of Technology, Fuzhou 350506, Fujian Province, China)

Affiliation: [1] Fuzhou Institute of Technology, Fuzhou 350506

Source: Journal of JingDeZhen University, 2021, No. 3, pp. 92-95 (4 pages)

Funding: Supported by the Fuzhou Institute of Technology Research Fund (FTKY004).

Abstract: The Naive Bayes algorithm is a probability-based classification algorithm that uses prior probabilities to compute posterior probabilities. It is a relatively simple and effective classifier, widely used in text classification, and can distinguish negative from positive reviews efficiently and accurately in massive data sets. Within the Naive Bayes family, the Multinomial Naive Bayes model is especially suited to features that describe occurrence counts, and is therefore commonly used for text classification. This paper applies Multinomial Naive Bayes to the classification of merchant reviews and, building on the traditional algorithm, adds TF-IDF feature weighting together with Laplace smoothing. The algorithm is implemented in Python and tested on upward of ten thousand reviews from multiple merchants. The experimental results show that the improved Multinomial Naive Bayes classification algorithm achieves high prediction accuracy, is stable, and predicts quickly.
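The combination described in the abstract (Multinomial Naive Bayes, TF-IDF weighting, Laplace smoothing) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the exact weighting formula, so the smoothed idf, the relative term frequency, the class name `TfidfMultinomialNB`, the helper `tfidf_vectors`, and the toy reviews are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Map tokenized documents to {term: tf-idf weight} dicts; also return the idf table."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Assumed idf variant: smoothed so every known term keeps a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Relative term frequency times idf.
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors, idf


class TfidfMultinomialNB:
    """Hypothetical sketch: multinomial NB fitted on TF-IDF weights instead of raw counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # alpha = 1.0 corresponds to Laplace (add-one) smoothing

    def fit(self, docs, labels):
        vectors, self.idf = tfidf_vectors(docs)
        self.vocab = set(self.idf)
        class_counts = Counter(labels)
        self.log_prior = {c: math.log(n / len(labels)) for c, n in class_counts.items()}
        # Accumulate the TF-IDF mass of each term within each class.
        weight = {c: defaultdict(float) for c in class_counts}
        for vec, label in zip(vectors, labels):
            for t, w in vec.items():
                weight[label][t] += w
        self.log_likelihood = {}
        for c in class_counts:
            total = sum(weight[c].values()) + self.alpha * len(self.vocab)
            # Smoothed estimate of P(term | class); unseen terms get alpha / total.
            self.log_likelihood[c] = {
                t: math.log((weight[c][t] + self.alpha) / total) for t in self.vocab
            }
        return self

    def predict(self, doc):
        # Terms never seen in training are ignored.
        counts = Counter(t for t in doc if t in self.vocab)
        n = sum(counts.values())
        scores = {}
        for c, prior in self.log_prior.items():
            scores[c] = prior + sum(
                (k / n) * self.idf[t] * self.log_likelihood[c][t]
                for t, k in counts.items()
            )
        return max(scores, key=scores.get)


# Toy, made-up reviews (already tokenized); not data from the paper.
train_docs = [
    ["good", "food", "great", "service"],
    ["great", "taste", "good", "price"],
    ["bad", "food", "slow", "service"],
    ["terrible", "taste", "bad", "price"],
]
train_labels = ["pos", "pos", "neg", "neg"]

model = TfidfMultinomialNB(alpha=1.0).fit(train_docs, train_labels)
print(model.predict(["good", "great", "food"]))
print(model.predict(["bad", "slow", "service"]))
```

Working in log space avoids underflow when many term probabilities are multiplied, and the additive smoothing term keeps a term that never appears in a class from driving that class's probability to zero. Real Chinese reviews would first need word segmentation, which this sketch leaves out.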

Keywords: Naive Bayes; TF-IDF; Laplace smoothing; text classification; Python

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
