Authors: Gu Xiangyuan (顾翔元); Guo Jichang (郭继昌); Li Chongyi (李重仪); Xiao Lijun (肖利军)
Affiliation: [1] School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
Source: Journal of Tianjin University: Science and Technology (《天津大学学报(自然科学与工程技术版)》), 2021, No. 2, pp. 214-220 (7 pages)
Funding: National Natural Science Foundation of China (No. 61771334).
Abstract: Some existing feature subset selection algorithms perform poorly because they evaluate redundant features with only a single metric, such as symmetric uncertainty or the maximal information coefficient. To address this problem, a feature subset selection algorithm based on symmetric uncertainty and three-way interaction information (SUTII) is proposed. First, the symmetric uncertainty between each feature and the class label is computed, the features are sorted in descending order of this value, and irrelevant features are removed. Then, the symmetric uncertainty between features and the three-way interaction information among features and the class label are computed; together with the symmetric uncertainty between features and the class label, they are used in comparison and ranking operations to remove redundant features and obtain the selected features. Because redundancy is evaluated with both symmetric uncertainty and three-way interaction information, combined with comparison and ranking operations, fewer relevant features are mistaken for redundant ones and removed, so some highly informative relevant features are retained. To validate its performance, SUTII is compared with four other feature subset selection algorithms using three classifiers (J48, IB1, and Naïve Bayes) on three UCI datasets and nine ASU datasets. The experimental results show that SUTII achieves good feature selection performance while selecting few features and taking little time.
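Keywords: feature subset selection; three-way interaction information; symmetric uncertainty; feature selection; ranking
Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]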
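
The two measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact redundancy rule, so only the measures and the initial relevance ranking are shown. The function names, the `threshold` parameter, and the sign convention I(X;Y;C) = I(X;Y|C) - I(X;Y) are assumptions made for illustration, and all variables are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * math.log2(c / n) for c in Counter(joint).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both columns are constant
    mutual_info = hx + hy - entropy(x, y)
    return 2.0 * mutual_info / (hx + hy)


def interaction_information(x, y, c):
    """I(X; Y; C) = I(X; Y | C) - I(X; Y); this sign convention is an assumption."""
    return (entropy(x, y) + entropy(x, c) + entropy(y, c)
            - entropy(x) - entropy(y) - entropy(c) - entropy(x, y, c))


def rank_relevant(features, labels, threshold=0.0):
    """Names of features sorted by SU with the label (descending).

    Features whose SU does not exceed the hypothetical `threshold`
    are treated as irrelevant and dropped.
    """
    scores = {name: symmetric_uncertainty(col, labels)
              for name, col in features.items()}
    kept = [name for name, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda name: scores[name], reverse=True)
```

A small XOR example suggests why a single pairwise metric can mislead: with x = [0, 0, 1, 1], y = [0, 1, 0, 1] and label c = x XOR y, each feature alone has zero symmetric uncertainty with the label, while the three-way interaction information is 1 bit under the convention assumed above.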