Feature Selection Algorithm Based on Mutual Information and Genetic Algorithm  Cited by: 2


Authors: ZHANG Jing[1], CAO Feng[2], DONG Yuying, ZHANG Chao[2], YU Yinzhong, TANG Chao[4]

Affiliations: [1] Department of Mathematics, Taiyuan University, Taiyuan 030032, China; [2] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; [3] Anhui Huaye New Energy Technology, Hefei 230601, China; [4] School of Artificial Intelligence and Big Data Studies, Hefei College, Hefei 230601, China

Source: Journal of Shanxi University (Natural Science Edition), 2024, No. 1, pp. 1-8 (8 pages)

Funding: National Natural Science Foundation of China (62072291, 62272284); Natural Science Foundation of Anhui Province (2008085MF202).

Abstract: This paper proposes a new supervised, wrapper-type feature selection algorithm based on mutual information and a genetic algorithm. The algorithm defines mutual-information-based metrics for the correlation between features and for the correlation between features and classes. Combined with the strong global search capability of the genetic algorithm, it searches the candidate feature space for a globally optimal feature subset with low inter-feature correlation, high feature-to-class correlation, and high classification accuracy. Comparative experiments against 8 correlation-based feature selection algorithms were conducted on 10 standard datasets. Under 3 classifiers, the proposed algorithm achieves average classification accuracies of 88.98%, 87.5%, and 86.95%, respectively, outperforming all compared algorithms. The results show that the proposed algorithm effectively reduces the dimensionality of the original feature set while improving classifier accuracy.
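The abstract does not give the paper's actual correlation metrics or fitness function, so the following is only a minimal sketch of the general idea, under stated assumptions. It scores a binary feature mask by mean feature-to-class mutual information minus mean pairwise feature-to-feature mutual information, and searches masks with a simple elitist genetic algorithm. The `fitness` formula, the `genetic_search` parameters, and the toy data are all hypothetical, and the classifier-accuracy term that a wrapper method would include is omitted here.

```python
import math
import random
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def fitness(mask, features, labels):
    """Assumed score (not the paper's): relevance minus redundancy."""
    chosen = [f for f, bit in zip(features, mask) if bit]
    if not chosen:
        return float("-inf")  # an empty subset is never acceptable
    relevance = sum(mutual_information(f, labels) for f in chosen) / len(chosen)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    redundancy = (sum(mutual_information(a, b) for a, b in pairs) / len(pairs)
                  if pairs else 0.0)
    return relevance - redundancy

def genetic_search(features, labels, pop_size=20, generations=30,
                   p_mut=0.1, seed=0):
    """Elitist GA over 0/1 masks: keep the top half, refill by
    one-point crossover and bit-flip mutation. Needs >= 2 features."""
    rng = random.Random(seed)
    n = len(features)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]

    def score(mask):
        return fitness(mask, features, labels)

    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=score)

# Toy data (made up): f0 equals the class label, f1 duplicates f0,
# f2 and f3 are independent of the label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
best = genetic_search(features, labels)
print(best, fitness(best, features, labels))
```

On this toy data the assumed score prefers a single informative feature: selecting both `f0` and `f1` is penalized because they are fully redundant, and adding an uninformative feature dilutes the mean relevance. A wrapper variant would add a cross-validated classifier accuracy term to `fitness`.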

Keywords: feature selection; correlation; mutual information; genetic algorithm

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
