Research on Spam Filtering Technology Based on Bayesian Algorithm (基于贝叶斯算法的垃圾邮件过滤技术研究)


Author: Gu Wei (顾玮) [1]

Affiliation: [1] Xuzhou Higher Normal School (徐州高等师范学校), Xuzhou, Jiangsu 221116

Source: Office Informatization (《办公自动化》), 2018, No. 1, pp. 55-57 (3 pages)

Abstract: This paper analyzes content-based spam filtering techniques and observes that spam filtering differs in many respects from ordinary text classification and text mining. Starting from the fact that an email has a structure that plain text lacks, it makes a series of improvements to the Bayesian filtering method: it proposes a threshold adjustment algorithm and designs an integrated weighted model that makes full use of the structural information in an email. Under the integrated weighted model, separate models are built for the mail header and the mail body, and their results are then combined by weighting to decide whether a message is spam. Tests on the latest standard data sets show that the improved and extended Bayesian filter achieves considerably better filtering performance than the classical Bayesian filter Bogo.
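The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea it describes: one Bayesian model for the header, one for the body, and a weighted combination compared against a threshold. Everything specific here is an assumption made for illustration, not the author's method: the names `NaiveBayes`, `combined_score` and `is_spam`, the whitespace tokenizer, the add-one smoothing, the linear weighting of the two probabilities, the header weight of 0.4 and the threshold of 0.5. The paper's threshold adjustment algorithm is not reproduced; the threshold is simply a parameter.

```python
import math
from collections import Counter


class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(text.lower().split())
        self.docs[label] += 1

    def spam_probability(self, text):
        # Vocabulary size, plus one slot reserved for unseen tokens.
        vocab = len(set(self.counts["spam"]) | set(self.counts["ham"])) + 1
        total_docs = self.docs["spam"] + self.docs["ham"]
        log_odds = 0.0
        for label, sign in (("spam", 1.0), ("ham", -1.0)):
            total = sum(self.counts[label].values())
            # Smoothed class prior, then add-one smoothed token likelihoods.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in text.lower().split():
                score += math.log((self.counts[label][tok] + 1) / (total + vocab))
            log_odds += sign * score
        # Clamp to keep math.exp from overflowing on long messages.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))


def combined_score(header_model, body_model, header, body, header_weight=0.4):
    """Assumed combination rule: weighted average of the two spam probabilities."""
    p_header = header_model.spam_probability(header)
    p_body = body_model.spam_probability(body)
    return header_weight * p_header + (1.0 - header_weight) * p_body


def is_spam(score, threshold=0.5):
    """Raising the threshold trades missed spam for fewer false positives."""
    return score >= threshold


# Toy training data, invented for this example.
header_model, body_model = NaiveBayes(), NaiveBayes()
header_model.train("from: promo@deals.example subject: win free prize", "spam")
header_model.train("from: alice@work.example subject: meeting agenda", "ham")
body_model.train("click now to claim your free prize money", "spam")
body_model.train("please review the agenda before the meeting tomorrow", "ham")

spam_score = combined_score(header_model, body_model,
                            "subject: free prize", "claim your free money now")
ham_score = combined_score(header_model, body_model,
                           "subject: meeting agenda", "please review the agenda")
print(is_spam(spam_score), is_spam(ham_score))
```

Keeping the two models separate lets the header vocabulary (sender domains, subject words) carry its own weight instead of being drowned out by the much longer body text, which is presumably the motivation for modeling them apart.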

Keywords: integrated weighted Bayes; minimum-risk Bayes; active-learning Bayes; feature selection; threshold adjustment

Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]

 
