Authors: 谢娟英 (XIE Juan-Ying)[1]; 吴肇中 (WU Zhao-Zhong); 郑清泉 (ZHENG Qing-Quan); 王明钊 (WANG Ming-Zhao)[1,2]
Affiliations: [1] School of Computer Science, Shaanxi Normal University, Xi'an 710119; [2] College of Life Sciences, Shaanxi Normal University, Xi'an 710119
Source: Acta Automatica Sinica (《自动化学报》), 2022, No. 5, pp. 1292-1306 (15 pages)
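Funding: Supported by the National Natural Science Foundation of China (62076159, 12031010, 61673251) and the Fundamental Research Funds for the Central Universities (GK202105003).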
Abstract: The discernibility of feature subsets (DFS) criterion does not account for how the measurement scales of features affect the discernibility of a feature subset. To address this deficiency, this paper introduces the coefficient of variation and proposes the generalized DFS criterion, abbreviated as GDFS (generalized discernibility of feature subsets). GDFS is combined with four search strategies, namely sequential forward search (SFS), sequential backward search (SBS), sequential forward floating search (SFFS), and sequential backward floating search (SBFS), to obtain four hybrid feature selection algorithms, with the extreme learning machine (ELM) as the classifier that guides the feature selection process. The feature subsets selected by GDFS are tested on datasets from the UCI machine learning repository and on classic gene expression datasets, and the ELM classifiers built on them are compared with those built on feature subsets selected by DFS and by Relief, DRJMIM, mRMR, LLE Score, AVC, SVM-RFE, VMInaive, AMID, AMID-DWSFS, CFR, and FSSC-SD. Statistical significance tests are also conducted among all of these methods. The experimental results show that GDFS is superior to the original DFS and selects feature subsets with better classification capability.
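The two generic ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's GDFS criterion, whose formula does not appear in this record. It only shows that the coefficient of variation (standard deviation divided by the mean) is unchanged when a feature is rescaled, and how a sequential forward search grows a subset greedily. The function names and the toy `score` function are hypothetical placeholders introduced for illustration.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.pstdev(values) / abs(mean)


def sequential_forward_search(n_features, score, k):
    """Greedy SFS: repeatedly add the feature that maximizes score(subset).

    `score` is any callable that rates a list of feature indices.
    Here it is a stand-in, NOT the GDFS criterion from the paper.
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# The same measurements in metres and in centimetres (made-up values).
heights_m = [1.60, 1.70, 1.80, 1.90]
heights_cm = [h * 100.0 for h in heights_m]
cv_m = coefficient_of_variation(heights_m)
cv_cm = coefficient_of_variation(heights_cm)

# Toy additive score over three features, purely for illustration.
toy_weights = [0.1, 0.9, 0.5]
toy_score = lambda subset: sum(toy_weights[i] for i in subset)
chosen = sequential_forward_search(3, toy_score, 2)
```

Because the standard deviation and the mean scale by the same factor, `cv_m` and `cv_cm` agree up to floating-point rounding, which is the property that makes the coefficient of variation a natural tool for handling features measured in different units.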
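Keywords: discernibility of feature subsets; feature selection; coefficient of variation; extreme learning machine; feature search strategy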
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]