Improved Decision Tree Algorithm Based on Samples Selection  Cited by: 18


Authors: FENG Shaorong (冯少荣)[1,2], XIAO Wenjun (肖文俊)[2]

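Affiliations: [1] School of Information Science and Technology, Xiamen University, Xiamen, Fujian 361005; [2] School of Computer Science and Engineering, South China University of Technology, Guangzhou, Guangdong 510640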

Source: Journal of Southwest Jiaotong University (《西南交通大学学报》), 2009, No. 5, pp. 643-647 (5 pages)

Funding: Natural Science Foundation of Fujian Province (A0310008); Key Project of the Fujian Province High-Tech Research Open Plan (2003H043)

Abstract: To improve the accuracy of decision tree classification, several classical decision tree classification algorithms were compared and an improved algorithm based on sample selection was proposed. The improved algorithm rests on two facts: the accuracy of a decision tree depends strongly on the training samples, and decision tree induction reaches only a locally optimal solution. It searches for better samples through repeated iteration, so a better classifier is obtained without changing the underlying decision tree classification algorithm. Because the method is not tied to any particular decision tree and iterates using only feedback from the inputs and outputs, it is broadly applicable. Experiments show that the ratio of the average error rates of the improved algorithm, ID3, and C4.5 is about 0.82 : 1.22 : 0.92.
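The iterative idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual selection rule, so everything here is an assumption made for illustration: the hill-climbing swap strategy, the use of a validation set as the feedback signal, and the names `train_stump`, `error_rate`, `select_samples`, and `make_data`. A one-level decision stump stands in for ID3 or C4.5; the selection loop treats the learner as a black box and sees only its inputs (training samples) and outputs (error rate).

```python
import random

def train_stump(samples):
    """Stand-in black-box learner (assumption): fit a one-level decision
    tree on (x, label) pairs with a numeric x and a label in {0, 1}."""
    best = None
    for t in sorted({x for x, _ in samples}):
        for left in (0, 1):
            errors = sum((left if x <= t else 1 - left) != y for x, y in samples)
            if best is None or errors < best[0]:
                best = (errors, t, left)
    _, t, left = best
    return lambda x: left if x <= t else 1 - left

def error_rate(model, samples):
    """Fraction of samples the model misclassifies."""
    return sum(model(x) != y for x, y in samples) / len(samples)

def select_samples(pool, validation, train, subset_size, iterations, swap, seed=0):
    """Hypothetical selection loop: swap a few training samples at a time and
    keep the new subset only if the validation error strictly decreases."""
    rng = random.Random(seed)
    current = rng.sample(pool, subset_size)
    best_err = error_rate(train(current), validation)
    history = [best_err]
    for _ in range(iterations):
        candidate = current[:]
        for _ in range(swap):
            candidate[rng.randrange(subset_size)] = rng.choice(pool)
        err = error_rate(train(candidate), validation)
        if err < best_err:  # feedback comes only from the learner's output
            current, best_err = candidate, err
        history.append(best_err)
    return current, best_err, history

def make_data(n, noise, rng):
    """Synthetic data (assumption): label is 1 when x > 0.5, flipped with
    probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = 1 if x > 0.5 else 0
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

if __name__ == "__main__":
    rng = random.Random(1)
    pool, validation = make_data(200, 0.2, rng), make_data(100, 0.2, rng)
    _, err, history = select_samples(pool, validation, train_stump, 30, 20, 3)
    print("validation error per iteration:", history)
```

Because `select_samples` only calls `train` and reads an error rate, any tree inducer with the same interface could be substituted. Note that an error rate tuned on the validation set is optimistic; a separate held-out test set would be needed to compare methods fairly.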

Keywords: decision tree; samples selection; ID3 algorithm; classification

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
