Authors: LI Yuanjiang (李元江); QUAN Jinsheng (权金升); TAN Yangyi (谭阳奕); YANG Tian (杨田) (Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing (Hunan Normal University), Changsha, Hunan 410081, China)
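Affiliation: [1] Hunan Provincial Key Laboratory of Intelligent Computing and Language Information Processing (Hunan Normal University), Changsha 410081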
Source: Journal of Computer Applications (《计算机应用》), 2023, No. 5, pp. 1467-1472 (6 pages)
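Funding: Hunan Provincial Natural Science Foundation for Excellent Young Scholars (2021JJ20037); Changsha Outstanding Innovative Youth Training Program (kq1905031).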
Abstract: To address the curse of dimensionality caused by excessively high data dimensionality and redundant information, a high-dimensional Attribute Reduction algorithm based on Similarity and Difference Matrix (ARSDM) is proposed. Building on the discernibility matrix, the algorithm adds a similarity measure for samples of the same class, forming a comprehensive evaluation of all samples. Firstly, the distances between samples under each attribute are calculated, and the same-class similarity and different-class difference are obtained from these distances. Secondly, a similarity and difference matrix is established to evaluate the entire dataset. Finally, attribute reduction is performed: each column of the similarity and difference matrix is summed, the features with the largest sums are selected into the reduct one at a time, and the row vectors of the corresponding sample pairs are set to zero vectors. Experimental results show that, compared with the classical attribute reduction algorithms DMG (Discernibility Matrix based on Graph theory), FFRS (Fitting Fuzzy Rough Sets) and GBNRS (Granular Ball Neighborhood Rough Sets), the average classification accuracy of ARSDM is higher by 1.07, 6.48 and 8.92 percentage points respectively under the Classification And Regression Tree (CART) classifier, and by 1.96, 11.96 and 12.39 percentage points respectively under the Support Vector Machine (SVM) classifier. ARSDM also outperforms GBNRS and FFRS in running efficiency. ARSDM can therefore effectively remove redundant information and improve classification accuracy.
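The three steps named in the abstract (per-attribute distances, a similarity and difference matrix, greedy column-sum selection) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' method: min-max scaling, absolute difference as the per-attribute distance, `1 - distance` as the same-class similarity, and the hypothetical `threshold` that decides which sample-pair rows count as handled by a selected attribute. The function names `build_sd_matrix` and `greedy_reduce` are likewise illustrative.

```python
from itertools import combinations

import numpy as np


def build_sd_matrix(X, y):
    """Return a matrix with one row per sample pair and one column per attribute.

    Assumed scoring (not from the paper): different-class pairs score the
    per-attribute distance, same-class pairs score 1 - distance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Min-max scale each attribute so per-attribute distances lie in [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # constant attributes: avoid division by zero
    Xs = (X - X.min(axis=0)) / span
    rows = []
    for i, j in combinations(range(len(Xs)), 2):
        dist = np.abs(Xs[i] - Xs[j])
        rows.append(dist if y[i] != y[j] else 1.0 - dist)
    return np.array(rows)


def greedy_reduce(M, threshold=0.5):
    """Greedily pick attributes by column sum, zeroing the rows they handle.

    `threshold` is a hypothetical cut-off: a pair row is treated as handled
    by the chosen attribute when its score in that column exceeds it.
    """
    M = M.copy()
    chosen = np.zeros(M.shape[1], dtype=bool)
    selected = []
    while M.any() and not chosen.all():
        scores = M.sum(axis=0)
        scores[chosen] = -np.inf  # never pick the same attribute twice
        best = int(np.argmax(scores))
        if scores[best] <= 0:
            break  # no remaining attribute contributes anything
        chosen[best] = True
        selected.append(best)
        M[M[:, best] > threshold] = 0.0  # zero the rows of handled pairs
    return selected


# Toy data: attribute 0 separates the two classes, attribute 1 does not.
X = [[0.0, 0.5], [0.1, 0.4], [0.9, 0.5], [1.0, 0.4]]
y = [0, 0, 1, 1]
M = build_sd_matrix(X, y)
reduct = greedy_reduce(M)
```

On this toy data the column sum of attribute 0 dominates, so it is selected first and every pair row is zeroed. The published algorithm may define similarity, difference and the row-zeroing rule differently.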
Keywords: similarity and difference matrix; discernibility matrix; attribute reduction; rough set; granular computing; data mining
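Classification No.: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]; TP311.13 [Automation and Computer Technology: Control Science and Engineering]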