Malicious URL Recognition Method Based on Stacking Ensemble Learning


Authors: SUN Yang [1]; QIU XiangFeng [2]

Affiliations: [1] College of Computer Engineering, Jimei University, Xiamen 361021, Fujian, China; [2] Xiamen Kingtop Information Technology Co., Ltd., Xiamen 361021, Fujian, China

Source: Journal of Jimei University (Natural Science), 2025, No. 2, pp. 179-185 (7 pages)

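Funding: Natural Science Foundation of Fujian Province, project "Research on Adaptive Distributed Storage and Query Techniques for Large-Scale Graph Data" (2022J01336).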

Abstract: To address the low precision and poor real-time performance of traditional URL (uniform resource locator) detection methods when detecting malicious URLs, an algorithm model based on Stacking ensemble learning is proposed. The model uses five machine learning algorithms as primary classifiers: ADB (adaptive boosting), LR (logistic regression), SVM (support vector machine), GBDT (gradient boosting decision tree), and GNB (Gaussian naive Bayes). Its multi-layer structure lets the different models complement one another's strengths and improves the overall performance of the detection system. Finally, performance is evaluated on the test set to select the best-performing ensemble combination. Experimental results show that the ensemble model that fuses base learners with the Stacking method outperforms traditional machine learning models on several metrics, including recall, accuracy, precision, and F1 score, and reaches an accuracy of 96.77% for malicious URL detection.
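The two-layer structure described in the abstract can be sketched with scikit-learn's `StackingClassifier`. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the meta-learner, the URL features, or the dataset, so the logistic-regression meta-learner, the synthetic features from `make_classification`, and all hyperparameters below are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for numeric URL features (assumption: the real
# feature set and data are not given in the abstract).
X, y = make_classification(n_samples=600, n_features=12, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# First layer: the five primary classifiers named in the abstract.
# LR and SVM are wrapped with a scaler because both are scale-sensitive.
base_learners = [
    ("adb", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
    ("svm", make_pipeline(StandardScaler(), SVC(random_state=0))),
    ("gbdt", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("gnb", GaussianNB()),
]

# Second layer: a meta-learner trained on out-of-fold predictions of the
# base learners (logistic regression is an assumption, not from the paper).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The four metrics reported in the abstract, computed on the held-out set.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Selecting the best ensemble combination, as the abstract describes, could be approximated by refitting `stack` on different subsets of `base_learners` and comparing the resulting `metrics`; the numbers printed here come from synthetic data and say nothing about the paper's reported 96.77%.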

Keywords: malicious URL; machine recognition; Stacking model; ensemble learning; base learner

Classification code: TP393 [Automation and Computer Technology: Computer Application Technology]

 
