Detection method of phishing website based on imbalance SVM-incremental learning of massive data  Cited by: 1

Original title: 基于海量数据的不平衡SVM增量学习的钓鱼网站检测方法

Authors: YE Zhi-xiong (叶志雄)[1]; WANG Dan-hong (王丹弘) (China Mobile Group Guangdong Co., Ltd., Guangzhou 510625, China)

Affiliation: [1] China Mobile Group Guangdong Co., Ltd., Guangzhou 510635

Source: Telecom Engineering Technics and Standardization (《电信工程技术与标准化》), 2016, No. 12, pp. 26-31 (6 pages)

Abstract: Phishing websites cause heavy losses every year to users in e-commerce, telecommunications, banking, and other sectors, and defending against them effectively has become a difficult task. Based on an analysis of real-world data, this paper describes each web page with two groups of features: URL-related characteristics and the text content of the page. A separate classifier is built for each group of features, and each classifier is optimized following the idea of incremental learning to improve the algorithm's online learning ability. Finally, a classifier ensemble method combines the predictions of the individual classifiers to achieve online intelligent detection of phishing websites. Experiments show that the ensemble classifier has good online learning ability and generalization ability.
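The pipeline in the abstract (one classifier per feature view, incremental updates, ensemble of predictions) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which URL or text features are used, how the imbalanced SVM is trained, or how the ensemble combines votes. Here the feature extractors `url_features` and `text_features`, the class names, the `pos_weight` cost, and the score-averaging rule are all hypothetical, and a linear SVM trained by stochastic gradient descent on a class-weighted hinge loss stands in for the paper's imbalanced incremental SVM.

```python
import re


def url_features(url):
    """Hypothetical URL view: the set of character trigrams in the URL."""
    u = url.lower()
    return {u[i:i + 3] for i in range(len(u) - 2)}


def text_features(text):
    """Hypothetical text view: the set of lowercase word tokens on the page."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class IncrementalWeightedSVM:
    """Linear SVM over binary token features, trained by SGD on a
    class-weighted hinge loss so it can be updated batch by batch.
    (A stand-in chosen for this sketch, not the paper's algorithm.)"""

    def __init__(self, lr=0.1, reg=1e-3, pos_weight=5.0):
        self.w = {}                   # token -> weight (sparse)
        self.b = 0.0
        self.lr = lr
        self.reg = reg
        self.pos_weight = pos_weight  # extra cost for the rare phishing class

    def decision(self, tokens):
        return sum(self.w.get(t, 0.0) for t in tokens) + self.b

    def partial_fit(self, batch):
        """Update on one batch of (tokens, label) pairs, label in {+1, -1}."""
        for tokens, y in batch:
            margin = y * self.decision(tokens)  # computed before the update
            shrink = 1.0 - self.lr * self.reg   # L2 regularisation step
            for t in self.w:
                self.w[t] *= shrink
            if margin < 1.0:                    # hinge loss is active
                step = self.lr * y * (self.pos_weight if y > 0 else 1.0)
                for t in tokens:
                    self.w[t] = self.w.get(t, 0.0) + step
                self.b += step


class EnsembleDetector:
    """One incremental classifier per feature view; scores are averaged."""

    def __init__(self):
        self.views = [
            (lambda url, text: url_features(url), IncrementalWeightedSVM()),
            (lambda url, text: text_features(text), IncrementalWeightedSVM()),
        ]

    def partial_fit(self, batch):
        """batch: list of (url, page_text, label), label +1 = phishing."""
        for extract, clf in self.views:
            clf.partial_fit([(extract(u, t), y) for u, t, y in batch])

    def score(self, url, text):
        scores = [clf.decision(extract(url, text)) for extract, clf in self.views]
        return sum(scores) / len(scores)

    def is_phishing(self, url, text):
        return self.score(url, text) > 0.0


# Toy, made-up pages for illustration only (not data from the paper).
PHISH = ("http://paypa1-login.example.net/verify", "verify your account password now")
LEGIT = ("https://www.example.org/news", "daily news weather and sports")

detector = EnsembleDetector()
for _ in range(10):  # ten incoming batches, processed one at a time
    detector.partial_fit([(PHISH[0], PHISH[1], +1), (LEGIT[0], LEGIT[1], -1)])
```

Because each call to `partial_fit` only touches the current batch, the model can keep learning as new labelled pages arrive without retraining on the full history. Weighting errors on the phishing class more heavily is one common way to handle class imbalance; whether the paper uses this or another scheme is not stated in the abstract.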

Keywords: incremental learning; phishing website; imbalanced SVM method; ensemble classification

Classification code: TP393.092 [Automation and Computer Technology: Computer Application Technology]

 
