Feature Selection Method Based on Self-information and Fuzzy Neighborhood Conditional Entropy (基于自信息和模糊邻域条件熵的特征选择方法)

Authors: XU Jiucheng; DUAN Jianghao; NIU Wulin; ZHANG Shan; BAI Qing

Affiliations: [1] College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China [2] Engineering Lab of Intelligence Business & Internet of Things of Henan Province, Xinxiang 453007, Henan, China

Source: Journal of Shanxi University (Natural Science Edition), 2025, No. 1, pp. 77-88 (12 pages)

Funding: National Natural Science Foundation of China (61976082, 62076089, 62002103).

Abstract: Feature selection methods based on fuzzy neighborhood rough sets usually consider only the classification information in the lower approximation and ignore the classification information in the upper approximation and the boundary region. To address this problem, this paper proposes a feature selection algorithm based on self-information and fuzzy neighborhood conditional entropy. First, three self-information uncertainty measures are constructed from the lower approximation, the upper approximation, and the boundary region, and the three are combined into a similarity self-information measure. Second, from the perspective of information theory, an uncertainty measure based on fuzzy neighborhood conditional entropy is given and combined with similarity self-information to form a more comprehensive feature evaluation function, which measures the uncertainty of the classification information carried by a feature subset; a feature selection algorithm is then designed on this basis using the maximum relevance minimum redundancy technique. Finally, comparative experiments on datasets show that the proposed algorithm can effectively handle the classification information in the upper approximation and the boundary region. Compared with existing algorithms, its average classification accuracy under two classifiers improves by 2.55% and 4.15%, respectively, on low-dimensional datasets, and by 0.83% and 2.54%, respectively, on high-dimensional datasets.
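The abstract does not give the paper's formulas, so the following is only a rough sketch of the ingredients it names, written with textbook-style definitions that are assumptions here: a fuzzy neighborhood relation built from 1 - |difference| on data scaled to [0, 1] with a cut-off `lam`, fuzzy lower and upper approximations in the usual min/max form, and a conditional entropy computed from fuzzy cardinalities. The function names, the parameter `lam`, and the plain greedy forward search are all hypothetical; the search stands in for the paper's evaluation function and its maximum relevance minimum redundancy step, neither of which is reproduced.

```python
import math

def fuzzy_neighborhood(X, features, lam=0.7):
    """n x n fuzzy neighborhood relation on the given feature indices.

    Assumption: data are scaled to [0, 1]; similarity is
    1 - max_a |x_a - y_a|, and similarities below lam are cut to 0.
    """
    n = len(X)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sim = 1.0 - max(abs(X[i][a] - X[j][a]) for a in features)
            R[i][j] = sim if sim >= lam else 0.0
    return R

def approximations(R, y):
    """Lower, upper and boundary memberships of each sample in its own class."""
    n = len(y)
    lower, upper = [], []
    for i in range(n):
        same = [1.0 if y[j] == y[i] else 0.0 for j in range(n)]
        lower.append(min(max(1.0 - R[i][j], same[j]) for j in range(n)))
        upper.append(max(min(R[i][j], same[j]) for j in range(n)))
    boundary = [u - l for u, l in zip(upper, lower)]
    return lower, upper, boundary

def conditional_entropy(R, y):
    """-(1/n) * sum_i log2(|N(x_i) and class(x_i)| / |N(x_i)|), fuzzy cardinalities."""
    n = len(y)
    total = 0.0
    for i in range(n):
        card = sum(R[i])  # at least 1, because R[i][i] == 1
        inter = sum(R[i][j] for j in range(n) if y[j] == y[i])
        total += math.log2(inter / card)
    return -total / n

def greedy_select(X, y, lam=0.7, tol=1e-9):
    """Forward search: add the feature that lowers the entropy most, stop when no gain."""
    remaining = list(range(len(X[0])))
    selected, best = [], math.inf
    while remaining:
        scores = {f: conditional_entropy(fuzzy_neighborhood(X, selected + [f], lam), y)
                  for f in remaining}
        f_best = min(scores, key=scores.get)
        if best - scores[f_best] <= tol:
            break
        selected.append(f_best)
        remaining.remove(f_best)
        best = scores[f_best]
    return selected
```

A sample whose lower membership is below 1 has neighbors from another class, and its boundary value (upper minus lower) is then positive. That boundary value is the kind of information the abstract says is lost when only the lower approximation is used.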

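Keywords: fuzzy neighborhood rough set; self-information; uncertainty measure; fuzzy neighborhood entropy; fuzzy neighborhood conditional entropy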

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
