Automatic Classification of Computer Vulnerability Based on S-C Feature Extraction  Cited by: 3


Authors: REN Jiadong [1,2]; WANG Qian; WANG Fei [1,2]; LI Yazhou; LIU Jiaxin

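Affiliations: [1] College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066001, China; [2] Computer Virtual Technology and System Integration Laboratory of Hebei Province, Qinhuangdao, Hebei 066001, China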

Source: Journal of Frontiers of Computer Science and Technology, 2020, No. 7, pp. 1173-1182 (10 pages)

Funding: National Natural Science Foundation of China, Nos. 61472341, 61772449, 61572420, 61807028, 61802332.

Abstract: In recent years, the number of previously unknown computer vulnerabilities has grown rapidly, and analyzing and classifying large volumes of vulnerability data promptly and accurately remains an important open problem. This paper therefore proposes a text classification method for computer vulnerability descriptions that combines feature extraction based on information entropy and a comprehensive function (S-C) with an averaged one-dependence estimators (AODE) classifier, which takes the relationships among feature words into account. First, feature words are extracted with the S-C method. The comprehensive function C combines a word's inter-class importance and intra-class importance to measure how important the word is to each class. The information entropy S of the word across classes is then used to down-weight words whose class distribution is too scattered to be discriminative, which yields an accurate feature word set. Finally, the vulnerability data set is classified with AODE. Comparative experiments show that the S-C method extracts an accurate feature word set and that, combined with the AODE classifier, it achieves higher classification accuracy than traditional classifier models.

Keywords: computer vulnerability; text classification; feature extraction; information entropy

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
