Incremental attribute reduction method for incomplete hybrid data with variable precision

Original title: 变精度下不完备混合数据的增量式属性约简方法

Cited by: 16

Authors: WANG Yinglong (王映龙) [1]; ZENG Qi (曾淇) [1]; QIAN Wenbin (钱文彬); SHU Wenhao (舒文豪) [3]; HUANG Jintao (黄锦涛)

Affiliations: [1] School of Computer and Information Engineering, Jiangxi Agricultural University, Nanchang, Jiangxi 330045, China; [2] School of Software, Jiangxi Agricultural University, Nanchang, Jiangxi 330045, China; [3] School of Information Engineering, East China Jiaotong University, Nanchang, Jiangxi 330013, China

Source: Journal of Computer Applications (《计算机应用》), 2018, No. 10, pp. 2764-2771 (8 pages)

Funding: National Natural Science Foundation of China (61502213; 61662023); Natural Science Foundation of Jiangxi Province (20161BAB212049)

Abstract: To address the high computational complexity of static attribute reduction methods when data are added dynamically to an incomplete hybrid decision system, an incremental attribute reduction method for incomplete hybrid data with variable precision is proposed. First, conditional entropy is used to measure the significance of attributes under the variable precision model. Then, the incremental update of the conditional entropy and the update mechanism of the attribute reduct are analyzed and designed in detail for the case where data are added dynamically. On this basis, an incremental attribute reduction algorithm is constructed with a heuristic greedy strategy, which achieves dynamic updating of the attribute reduct for incomplete hybrid data containing both numeric and symbolic attributes. The method is compared experimentally on five real hybrid datasets from UCI. In terms of reduction effect, when the Echocardiogram, Hepatitis, Autos, Credit and Dermatology datasets are processed with an incremental scale of 90%+10%, the number of attributes is reduced from 12, 19, 25, 17 and 34 to 6, 7, 10, 11 and 13, respectively, that is, to 50.0%, 36.8%, 40.0%, 64.7% and 38.2% of the original attribute set. In terms of execution time, the average time of the incremental algorithm on the five datasets is 2.99 s, 3.13 s, 9.70 s, 274.19 s and 50.87 s, whereas the average time of the static algorithm is 284.92 s, 302.76 s, 1062.23 s, 3510.79 s and 667.85 s. The running time of the incremental algorithm depends on the number of instances, the number of attributes and the distribution of attribute value types in the dataset. The experimental results show that the incremental attribute reduction algorithm is significantly faster than the static algorithm and effectively removes redundant attributes from the data.
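The percentages and timings quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below only recomputes ratios from the numbers stated above; it does not implement the paper's algorithm, and the variable and function names are illustrative assumptions.

```python
# Figures copied from the abstract; all names here are illustrative, not from the paper.
datasets = ["Echocardiogram", "Hepatitis", "Autos", "Credit", "Dermatology"]
original_attrs = [12, 19, 25, 17, 34]
reduced_attrs = [6, 7, 10, 11, 13]
incremental_s = [2.99, 3.13, 9.70, 274.19, 50.87]   # average seconds, incremental algorithm
static_s = [284.92, 302.76, 1062.23, 3510.79, 667.85]  # average seconds, static algorithm


def retained_percent(original: int, reduced: int) -> float:
    """Share of the original attributes kept in the reduct, in percent (one decimal)."""
    return round(100.0 * reduced / original, 1)


def time_ratio(static: float, incremental: float) -> float:
    """How many times longer the static algorithm ran than the incremental one."""
    return static / incremental


for name, o, r, inc, sta in zip(datasets, original_attrs, reduced_attrs,
                                incremental_s, static_s):
    print(f"{name}: {retained_percent(o, r)}% of attributes kept, "
          f"static/incremental time ratio {time_ratio(sta, inc):.1f}")
```

On these figures the recomputed percentages agree with those in the abstract, and the static-to-incremental time ratio ranges from roughly 13 (Credit, Dermatology) to roughly 110 (Autos).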

Keywords: rough set; attribute reduction; neighborhood relation; incremental method; incomplete hybrid data

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

