Authors: ZHANG Haijun (张海军) [1]; CHEN Yinghui (陈映辉) [2]
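Affiliations: [1] School of Computer, Jiaying University, Meizhou 514015, Guangdong, China; [2] School of Mathematics, Jiaying University, Meizhou 514015, Guangdong, China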
Source: Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》), 2020, No. 2, pp. 118-128 (11 pages)
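Funding: National Natural Science Foundation of China (61171141, 61573145); Key Program of the Natural Science Foundation of Guangdong Province (2014B010104001, 2015A030308018); Key Research Base of Humanities and Social Sciences Jointly Built by Province and City for Regular Higher Education Institutions of Guangdong Province (18KYKT11); Key Program of the Natural Science Foundation of Jiaying University, Guangdong Province (2017KJZ02).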
Abstract: Access-traffic corpus big data is converted to word vectors using semantic scenario analysis and vectorization, enabling intelligent detection of cross-site scripting (XSS) attacks on big data. Natural language processing methods are used for data preprocessing, including data acquisition, data cleaning, data sampling, and feature extraction. A neural-network-based word vectorization algorithm is designed to produce the word-vector big data, and, through theoretical analysis and derivation, intelligent detection algorithms based on long short-term memory (LSTM) networks of several different depths are implemented. Repeated experiments with different hyperparameter settings give a maximum recognition rate of 0.9995, a minimum recognition rate of 0.2643, a mean recognition rate of 99.88%, a variance of 0, and a standard deviation of 0.0004, together with curves showing how the recognition rate, the loss, the cosine proximity of word-vector samples, and the mean absolute error change. The results indicate that the algorithm offers a high recognition rate, strong stability, and good overall performance.
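Keywords: network intrusion detection; cross-site scripting attack; natural language processing; deep long short-term memory network; big data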
Classification code: TP309.2 [Automation and Computer Technology: Computer System Architecture]