Application of Data Mining in Spam Detection  (Cited by: 6)

Author: Lin Dongmao (林冬茂)[1]

Affiliation: [1] Modern Educational Technology Center, Huzhou Teachers College (湖州师范学院), Huzhou, Zhejiang 313000, China

Source: Computer Simulation (《计算机仿真》), 2012, No. 2, pp. 120-123 (4 pages)

Abstract: This paper studies the accuracy of spam detection, with the aim of improving network security. E-mail features are high-dimensional and highly redundant. Traditional detection models cannot reduce the feature dimension or eliminate the redundant information, which leads to long computation time, high space complexity, and low spam detection accuracy. To improve detection accuracy, the paper proposes a two-layer spam detection model that combines a white list with a support vector machine (SVM). Feature clustering is used to group the features, which reduces the feature dimension and removes redundant information among features. White-list detection serves as the first line of defense of the detection system and detects spam from known addresses; the SVM serves as the second line of defense and detects new spam. The model is tested on spam data, and the experimental results show that the two-layer model effectively improves both the efficiency and the accuracy of spam detection, providing an effective means for e-mail management.
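
The two-layer idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, which this record does not give. Several things are assumptions made only for illustration: the first layer is modeled as a set of trusted sender addresses that bypass the classifier, the feature clusters are supplied by hand rather than learned, the SVM is a plain linear one trained by sub-gradient descent on the hinge loss, and all names (`merge_feature_clusters`, `train_linear_svm`, `classify`, the example addresses) and the toy data are hypothetical.

```python
def merge_feature_clusters(x, clusters):
    """Collapse each cluster of raw feature indices into one summed feature.

    This stands in for the feature-clustering step: redundant features
    that belong to the same cluster are merged, lowering the dimension.
    """
    return [sum(x[i] for i in cluster) for cluster in clusters]


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=50):
    """Fit a linear SVM by sub-gradient descent on the regularized hinge loss.

    X is a list of feature vectors; y is a list of labels in {-1, +1}.
    Returns the weight vector and the bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [(1.0 - lr * lam) * wj for wj in w]
            if margin < 1.0:
                # Sample is misclassified or inside the margin: step toward it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def classify(sender, raw_features, whitelist, clusters, model):
    """Two-layer decision: address lookup first, SVM second."""
    # Layer 1 (assumed behavior): known trusted senders skip the classifier.
    if sender in whitelist:
        return "ham"
    # Layer 2: linear SVM on the cluster-reduced features.
    w, b = model
    x = merge_feature_clusters(raw_features, clusters)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "spam" if score > 0 else "ham"


# Hypothetical toy data: four raw word-count features, where features 0-1
# are assumed to be spam-like terms and features 2-3 work-like terms.
CLUSTERS = [[0, 1], [2, 3]]
RAW_TRAIN = [
    [3, 2, 0, 0], [2, 3, 0, 0], [4, 1, 0, 1],   # spam
    [0, 0, 2, 3], [0, 1, 3, 2], [0, 0, 1, 2],   # legitimate mail
]
LABELS = [1, 1, 1, -1, -1, -1]                  # +1 = spam, -1 = ham
WHITELIST = {"colleague@example.org"}

X_train = [merge_feature_clusters(x, CLUSTERS) for x in RAW_TRAIN]
MODEL = train_linear_svm(X_train, LABELS)

if __name__ == "__main__":
    print(classify("colleague@example.org", [3, 3, 0, 0], WHITELIST, CLUSTERS, MODEL))
    print(classify("unknown@example.com", [3, 3, 0, 0], WHITELIST, CLUSTERS, MODEL))
    print(classify("unknown@example.com", [0, 0, 2, 2], WHITELIST, CLUSTERS, MODEL))
```

Placing the address lookup first means that mail from known senders never reaches the feature extraction or the classifier, which is one plausible reading of the efficiency gain the abstract reports.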

Keywords: spam; classification; support vector machine; feature selection

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
