Dendritic Cell Model Based on mRMR and Gini Importance  (Cited by: 2)


Authors: ZHANG Kailin; DONG Hongbin[1]

Affiliation: [1] Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China

Source: Computer Engineering (《计算机工程》), 2023, No. 5, pp. 129-138 (10 pages)

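Funding: National Natural Science Foundation of China, "Continuous Response Mechanism of Computer Immune Intelligence and Its Applications" (61877045).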

Abstract: The Dendritic Cell Algorithm (DCA) simulates how Dendritic Cells (DCs) in the human immune system recognize and present antigens, and is a fast and effective anomaly detection method. Its key step is selecting effective features from the data to represent specific input signals. However, existing signal selection methods suffer from redundant feature subsets and high time complexity, so the generated antigen signals are less effective and the methods run slowly on high-dimensional, large-sample datasets. Considering both the usability of antigen signals and the time efficiency of signal selection, a DC model, MRGI-DCA, based on maximal Relevance Minimal Redundancy (mRMR) and Gini Importance (GI) is proposed. mRMR first quickly extracts the most relevant feature subset from the original dataset while minimizing the redundancy of that subset. Building on this mRMR pre-dimensionality reduction, and exploiting the speed and accuracy of the CART tree model, GI is then used to obtain more effective antigen signals. Experimental results show that MRGI-DCA outperforms IG-DCA, COR-DCA, GA-DCA, and SVM-DCA overall. Averaged over high-dimensional, low-dimensional, and anomaly datasets, its accuracy, F1 score, and AUC are 6.01%, 5.86%, and 9.96% higher than those of COR-DCA, respectively, and its average running time is approximately 1/5 that of COR-DCA.

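Keywords: dendritic cell algorithm; signal selection; maximal relevance minimal redundancy (mRMR) algorithm; Gini importance; artificial immune system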

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
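
The two-stage selection described in the abstract (mRMR pre-reduction, then ranking by Gini importance) can be illustrated with a minimal sketch. This is not the authors' MRGI-DCA implementation. It assumes discrete features, uses the common "relevance minus mean redundancy" form of mRMR, and replaces a full CART tree with the best single-threshold Gini impurity decrease per feature as a simplified stand-in for Gini importance. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def mrmr_select(columns, labels, k):
    """Greedy mRMR (assumed difference form): relevance minus mean redundancy."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected, remaining = [], list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(mutual_information(columns[j], columns[s])
                             for s in selected) / len(selected)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

def gini(labels):
    """Gini impurity of a label list."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def stump_gini_gain(col, labels):
    """Largest Gini impurity decrease from one threshold split on col.
    Simplified stand-in for CART Gini importance (a depth-1 tree)."""
    n, parent, best = len(labels), gini(labels), 0.0
    for t in sorted(set(col))[:-1]:
        left = [y for x, y in zip(col, labels) if x <= t]
        right = [y for x, y in zip(col, labels) if x > t]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best

def rank_signals(columns, labels, k):
    """Stage 1: mRMR keeps k features. Stage 2: rank them by Gini gain."""
    kept = mrmr_select(columns, labels, k)
    return sorted(kept, key=lambda j: stump_gini_gain(columns[j], labels),
                  reverse=True)

# Hypothetical toy data: 8 samples, 4 discrete features.
labels = [0, 0, 0, 1, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: strongly related to the label
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 1: exact duplicate of feature 0
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 2: weakly related, independent of 0
    [0, 0, 0, 0, 0, 0, 0, 0],  # feature 3: constant, carries no information
]
picked = mrmr_select(columns, labels, 2)
ranked = rank_signals(columns, labels, 2)
print(picked, ranked)
```

On this toy data, mRMR skips the duplicated feature 1 because its redundancy with feature 0 outweighs its relevance, which is the behavior the abstract attributes to the pre-reduction stage. A fuller version would replace `stump_gini_gain` with impurity decreases accumulated over all nodes of a trained CART tree.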

 
