Method for email spam detection based on behavioral and temporal features (基于行为与时间特征的垃圾邮件检测方法)  Cited by: 1

Authors: Shao Yeqin (邵叶秦)[1,2], Shi Quan (施佺)[1]

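Affiliations: [1] Modern Educational Technology Center, Nantong University, Nantong 226019, Jiangsu, China; [2] School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China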

Source: Journal of PLA University of Science and Technology (Natural Science Edition) (《解放军理工大学学报(自然科学版)》), 2013, No. 5, pp. 494-500 (7 pages)

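Funding: National Natural Science Foundation of China (61171132); Natural Science Foundation of Jiangsu Province (BK2010280); Nantong Applied Research Program (BK2011003; BK2012034; BK2012001); Nanjing Science and Technology Platform Program (CP2013001)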

Abstract: The sheer volume of spam and the variety of its disguises pose a great challenge to anti-spam efforts. This paper proposes a spam detection method based on behavioral and temporal features. From email sending and receiving records, behavioral features are derived from the email social network and temporal features are derived from email sending intervals. Stepwise discriminant analysis then selects the features with strong discriminative power to form a feature subspace, into which the training samples are projected. The labeled, projected training samples are used to train a support vector machine (SVM) classifier, whose decision criteria are used to identify spam. Comparative experiments from several perspectives were performed on three years of recent real email data. The results show that the proposed behavioral and temporal features effectively improve the accuracy and recall of spam detection, and that the overall performance of the method is better than that of other behavior-based spam detection methods.
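To make the feature-extraction step in the abstract concrete, here is a minimal Python sketch that derives a few per-sender features from a send/receive log. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not list the exact features, so the function name `extract_features` and the feature names `out_degree`, `in_degree`, `reciprocity`, `mean_gap`, and `gap_std` are all hypothetical stand-ins for "social-network behavior" and "sending-interval" features. Feature selection by stepwise discriminant analysis and SVM training are not shown.

```python
from collections import defaultdict
from statistics import mean, pstdev


def extract_features(log):
    """Build per-sender features from (sender, recipient, timestamp) records.

    All feature names below are hypothetical; the source abstract does not
    specify which behavioral or temporal features the paper uses.
    """
    recipients = defaultdict(set)   # addresses each sender wrote to
    senders = defaultdict(set)      # addresses that wrote to each recipient
    send_times = defaultdict(list)  # timestamps of each sender's messages

    for sender, recipient, ts in log:
        recipients[sender].add(recipient)
        senders[recipient].add(sender)
        send_times[sender].append(ts)

    features = {}
    for sender, times in send_times.items():
        times.sort()
        # Intervals between consecutive messages from the same sender
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        out_degree = len(recipients[sender])
        # Contacts that also wrote to this sender at some point
        mutual = len(recipients[sender] & senders[sender])
        features[sender] = {
            "out_degree": out_degree,
            "in_degree": len(senders[sender]),
            "reciprocity": mutual / out_degree,
            "mean_gap": mean(gaps) if gaps else 0.0,
            "gap_std": pstdev(gaps) if gaps else 0.0,
        }
    return features
```

Calling `extract_features` on a list of `(sender, recipient, unix_timestamp)` tuples returns one feature dictionary per sending address. The intuition behind this sketch is that bulk senders tend to contact many addresses, receive few replies, and send at short, regular intervals; whether these particular features match the paper's is an assumption.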

Keywords: social network; spam email; feature selection; support vector machine

CLC number: TP309 [Automation and Computer Technology: Computer System Architecture]

 
