Authors: XU Dong[1], WANG Xin[1], MENG Yulong[1], ZHANG Ziying[1]
Affiliation: [1] School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China
Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2020, No. 2, pp. 434-441 (8 pages)
Abstract: Discretizing multidimensional attributes can improve both the training speed and the accuracy of machine learning algorithms. Current discretization algorithms perform poorly, and most of them discretize one attribute at a time, ignoring potential associations between attributes. To address this, the paper proposes a discretization algorithm based on forest optimization and rough set (FORDA). Targeting the discretization of multidimensional continuous attributes, the algorithm designs a fitness function according to variable precision rough set theory, then constructs a forest optimization network and iteratively searches for the optimal subset of breakpoints. Experimental results on UCI datasets show that, compared with current mainstream discretization algorithms, the proposed algorithm avoids local optima and significantly improves the classification accuracy of an SVM classifier. Its discretization performance is better and it has a degree of generality, which verifies the effectiveness of the algorithm.
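Keywords: discretization; forest optimization; multidimensional; variable precision rough set; optimization network; breakpoint subset
Classification code: TP311.1 [Automation and Computer Technology: Computer Software and Theory]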
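Note: the abstract speaks of scoring candidate breakpoint subsets over several continuous attributes at once. The sketch below is only an illustration of that general idea, not the paper's method: the function names, the toy data, and the use of a variable precision approximation quality as the score are all assumptions, and the forest optimization search itself is not shown. A block of samples that share the same interval codes counts as consistent when its majority class share is at least `beta`.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def discretize(rows, cuts):
    """Map each continuous value to the index of the interval it falls in.

    rows: list of samples, each a list of floats (one per attribute).
    cuts: one sorted list of breakpoints per attribute (hypothetical input).
    """
    return [tuple(bisect_right(c, v) for v, c in zip(row, cuts)) for row in rows]


def approximation_quality(codes, labels, beta=1.0):
    """Fraction of samples lying in blocks whose majority-class share >= beta."""
    blocks = defaultdict(list)
    for code, label in zip(codes, labels):
        blocks[code].append(label)
    covered = 0
    for members in blocks.values():
        majority = Counter(members).most_common(1)[0][1]
        if majority / len(members) >= beta:
            covered += len(members)
    return covered / len(labels)


# Toy data (made up for illustration): two continuous attributes, two classes.
rows = [[0.2, 1.5], [0.4, 1.7], [0.9, 3.2], [1.1, 3.0], [0.5, 3.1]]
labels = ["a", "a", "b", "b", "a"]

cuts = [[0.7], [2.5]]  # one breakpoint per attribute
codes = discretize(rows, cuts)
quality = approximation_quality(codes, labels, beta=1.0)
```

Dropping the breakpoint on the first attribute (`cuts = [[], [2.5]]`) merges samples of different classes into one block, so the score falls under `beta=1.0` but recovers under a looser `beta=0.6`. A search procedure would compare candidate breakpoint subsets using a score of this kind.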