Authors: 杨立圣 (YANG Li-sheng), 罗文华 (LUO Wen-hua) [1]
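Affiliation: [1] School of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang 110035, China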
Source: 《小型微型计算机系统》 (Journal of Chinese Computer Systems), 2023, No. 4, pp. 875-880 (6 pages)
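Funding: Supported by the National Key R&D Program of China (2018YFC0830600); the Liaoning Province "Hundred, Thousand, Ten Thousand Talents Project" (2020921058); and the Graduate Innovation Ability Improvement Project of Criminal Investigation Police University of China (2022YCZD05).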
Abstract: Traditional malicious web page detection lacks a global, systematic perspective: rather than treating a web page as an organic whole, it studies features at individual levels in isolation, such as tag structure, URL address, and text content, which results in low accuracy. Although some researchers have proposed fusing features, they still implement the idea with machine learning algorithms, so the feature engineering workload is heavy and detection efficiency is low. To address these problems, this paper proposes Tri-BERT-SENet, a model based on multi-feature fusion, for detecting malicious web pages. The extracted HTML features, URL features, and page text features are each passed through a BERT model, drawing on BERT's context awareness, to produce three BERT outputs. These outputs are then treated as feature channels and weighted with SENet to produce the final detection result. Experimental results show that, compared with traditional machine learning models and with BERT-based detection on a single feature, the proposed method substantially improves the accuracy of malicious web page detection.
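The abstract describes treating three BERT outputs as feature channels and weighting them with SENet, but gives no layer sizes or implementation details. The following is only a minimal, generic squeeze-and-excitation sketch in plain Python under stated assumptions: the three short vectors stand in for BERT outputs (HTML, URL, text), and the function names and all weight values are hypothetical, hand-picked for illustration rather than taken from the paper.

```python
import math


def squeeze(channels):
    """Global average pooling: one scalar summary per channel."""
    return [sum(ch) / len(ch) for ch in channels]


def dense(vec, weights, biases):
    """Plain fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def excite(summary, w1, b1, w2, b2):
    """Bottleneck MLP (ReLU, then sigmoid) mapping summaries to gates in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(summary, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]


def se_reweight(channels, w1, b1, w2, b2):
    """Scale every channel by its gate; return (scaled channels, gates)."""
    gates = excite(squeeze(channels), w1, b1, w2, b2)
    scaled = [[g * x for x in ch] for g, ch in zip(gates, channels)]
    return scaled, gates


# Toy stand-ins for the three BERT outputs (assumption: real vectors
# would be much longer and come from the three BERT models).
html_vec = [0.2, -0.1, 0.4, 0.3]
url_vec = [0.9, 0.8, 1.0, 0.7]
text_vec = [-0.3, 0.1, 0.0, 0.2]

# Hypothetical parameters; in a trained model these would be learned.
w1 = [[1.0, 1.0, 0.5]]        # 3 channel summaries -> 1 hidden unit
b1 = [0.0]
w2 = [[0.5], [2.0], [-1.0]]   # 1 hidden unit -> 3 channel gates
b2 = [0.0, 0.0, 0.0]

scaled, gates = se_reweight([html_vec, url_vec, text_vec], w1, b1, w2, b2)
print([round(g, 3) for g in gates])
```

With these illustrative weights the URL channel receives the largest gate and the text channel the smallest. In a full model, the rescaled channels would feed a classification layer that outputs the malicious or benign decision.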
CLC number: TP393 [Automation and Computer Technology: Computer Application Technology]