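Affiliations: [1] Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230039 [2] School of Computer Science and Technology, Anhui University, Hefei 230601
Source: Computer Science (《计算机科学》), 2020, No. 3, pp. 73-78 (6 pages)
Funding: National Natural Science Foundation of China (61402005); Natural Science Foundation of Anhui Province (1308085QF114); Provincial Natural Science Foundation of Anhui Higher Education Institutions (KJ2013A015)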
Abstract: A discernibility matrix records the distinguishing information for every pair of objects in an information system, which offers a new direction for studying attribute reduction. However, the traditional discernibility matrix uses the core attributes to eliminate redundant entries only after construction is finished, ignoring the role the core attributes can play during construction. To address this problem, the paper does the following. First, the construction of the discernibility matrix is optimized: before the distinguishing information of any two objects is computed, their values on the core attributes are compared, and if these are not equal, the corresponding entry is recorded directly as Φ and the remaining condition attributes are not examined. Second, the concept of attribute weighted importance is proposed, which combines the proportion of non-empty entries of the discernibility matrix in which each condition attribute appears (called macro importance) with the contribution of each attribute to distinguishing objects (called micro importance); an example illustrates that the measure is reasonable. Third, because the optimized matrix still contains many redundant entries and empty sets, a discernibility information tree based on the optimized discernibility matrix and attribute weighted importance is proposed. All non-empty entries of the optimized discernibility matrix are sorted by attribute weighted importance, so that attributes of high importance are shared by more nodes; during construction, each entry that does not contain a core attribute is mapped to a path in the tree, while entries that contain core attributes are ignored. Finally, a reduction algorithm, HSDI-tree, based on the optimized discernibility matrix and the improved discernibility information tree is proposed. On five UCI datasets, the reduction results and node counts of HSDI-tree are compared with those of the CDI-tree, DI-tree and IDI-tree algorithms; the experimental results show that HSDI-tree effectively finds a minimal attribute reduction and has better space compression.
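Keywords: rough set; attribute importance; discernibility matrix; attribute reduction; discernibility information tree
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]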
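The core-first construction and the macro importance described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It makes several assumptions: the function names, the dict-based matrix layout, and the toy data are all hypothetical; the core attributes are supplied by the caller, since the abstract does not say how they are obtained; and the information system has no decision attribute. Micro importance and the tree itself are left out because the abstract does not give their definitions.

```python
from itertools import combinations


def optimized_discernibility_matrix(objects, attributes, core):
    """Return {(i, j): frozenset_of_attributes} for every pair i < j.

    objects    -- list of dicts mapping attribute name -> value
    attributes -- list of condition attribute names
    core       -- set of core attribute names (assumed known in advance)
    """
    matrix = {}
    for i, j in combinations(range(len(objects)), 2):
        x, y = objects[i], objects[j]
        # Core check first: if the pair differs on any core attribute,
        # record the empty entry and skip the remaining attributes.
        if any(x[a] != y[a] for a in core):
            matrix[(i, j)] = frozenset()
            continue
        # Otherwise collect every attribute on which the two objects differ.
        matrix[(i, j)] = frozenset(a for a in attributes if x[a] != y[a])
    return matrix


def macro_importance(matrix, attributes):
    """Share of non-empty entries that contain each attribute."""
    non_empty = [entry for entry in matrix.values() if entry]
    if not non_empty:
        return {a: 0.0 for a in attributes}
    return {a: sum(a in entry for entry in non_empty) / len(non_empty)
            for a in attributes}


# Hypothetical toy information system; objects 0 and 1 differ only on "a",
# so "a" is a core attribute in the classical sense.
attributes = ["a", "b", "c", "d"]
objects = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 1},
]
matrix = optimized_discernibility_matrix(objects, attributes, core={"a"})
macro = macro_importance(matrix, attributes)
```

In this toy case the three pairs that differ on `a` collapse to the empty entry without the other attributes being compared, leaving only `{b, c}`, `{b, d}` and `{c, d}` as non-empty entries to be sorted and inserted into a tree.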