Affiliations: [1] School of Economics and Management, Harbin Engineering University, Harbin 150001, China; [2] School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China; [3] College of Science, Harbin Engineering University, Harbin 150001, China
Source: Chinese Journal of Electronics (English edition of Acta Electronica Sinica), 2017, No. 5, pp. 999-1007 (9 pages)
Funding: supported by the National Natural Science Foundation of China (No. 51409065, No. 71101034); the Heilongjiang Provincial Natural Science Foundation (No. JJ2016QN0048); the Heilongjiang Provincial Young Science Foundation (No. JJ2016QN0645); and the Heilongjiang Provincial Postdoctoral Fund (No. LBH-Z15047)
Abstract: The simplicity and interpretability of decision tree induction make it one of the more widely used machine learning methods for data classification. However, for continuous-valued (real and integer) attribute data, there is room for further improvement in classification accuracy, complexity, and tree scale. We propose a new K-ary partition discretization method with no more than K-1 cut points, based on Gaussian membership functions and the expected class number. A new K-ary crisp decision tree induction for continuous-valued attributes is also proposed, which combines the proposed discretization method with a Gini index splitting criterion. Experimental results and non-parametric statistical tests on 19 real-world datasets showed that the proposed algorithm outperforms four conventional approaches in terms of classification accuracy, tree scale, and particularly tree depth. Considering the number of nodes, the proposed method's decision tree tends to be more balanced than those of the other four methods. The complexity of the proposed algorithm was relatively low.
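The two building blocks the abstract names can be illustrated concretely. The sketch below is a minimal, hypothetical interpretation (not the paper's actual algorithm): a Gini impurity criterion, and per-class Gaussian membership functions whose equal-membership points between adjacent class models yield at most K-1 cut points for K classes. Function names and the cut-point placement rule are assumptions for illustration.

```python
import numpy as np

def gini(labels):
    # Gini impurity of a label array: 1 - sum_c p_c^2
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gaussian_membership(x, mu, sigma):
    # Gaussian membership degree of value x for a class modeled by (mu, sigma)
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def kary_cut_points(values, labels):
    # Fit one Gaussian per class on the attribute, then place one cut
    # point between each pair of adjacent class means where the two
    # memberships are equal (weighted midpoint for equal-height
    # Gaussians) -> at most K-1 cut points for K classes.
    classes = np.unique(labels)
    params = sorted(
        (values[labels == c].mean(), max(values[labels == c].std(), 1e-9))
        for c in classes
    )
    cuts = []
    for (m1, s1), (m2, s2) in zip(params, params[1:]):
        cuts.append((s2 * m1 + s1 * m2) / (s1 + s2))
    return cuts
```

For two well-separated classes, this yields a single cut point between the class means; the discretized attribute can then be scored with `gini` at each tree node.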
Keywords: Decision tree; K-ary decision tree; Continuous-valued attribute; Gaussian membership function
Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]