Analysis and improvement of the ID3 decision tree algorithm  Cited by: 38


Authors: 王小巍 (Wang Xiaowei)[1], 蒋玉明 (Jiang Yuming)[1]

Affiliation: [1] College of Computer Science, Sichuan University, Chengdu, Sichuan 610064, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2011, No. 9, pp. 3069-3072, 3076 (5 pages)

Abstract: To address the shortcomings of the ID3 algorithm [1-3], an improved algorithm based on ID3 is designed. It corrects the information gain with a correction parameter, overcoming ID3's bias toward attributes that take many distinct values. It discretizes continuous-valued attributes, which solves the problem of handling continuous attributes. It also handles samples with missing attribute values, on the idea that samples with unknown values are randomly distributed according to the relative frequencies of the known values. The concrete steps for generating a decision tree [4] with the improved ID3 algorithm are described, and the improved algorithm is applied to customer churn analysis in a customer relationship management system. Analysis and comparison of the experimental results show that the improved algorithm achieves higher prediction accuracy than the original ID3 algorithm, which demonstrates its effectiveness.
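As a rough illustration of two ideas named in the abstract, the sketch below computes the standard ID3 information gain and fills missing attribute values by sampling from the relative frequencies of the known values. It is a minimal sketch, not the authors' method: the paper's correction parameter and its discretization procedure are not given in this record, so neither is reproduced here. The function names (`entropy`, `information_gain`, `fill_missing_by_frequency`) and the toy churn data are assumptions made up for this example.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 gain: H(labels) minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def fill_missing_by_frequency(values, rng, missing=None):
    """Replace missing entries by sampling from the known values' relative frequencies."""
    known = [v for v in values if v is not missing]
    counts = Counter(known)
    choices, weights = zip(*counts.items())
    return [v if v is not missing else rng.choices(choices, weights=weights)[0]
            for v in values]


# Toy, made-up data (not from the paper): churn labels for six customers.
labels = ["stay", "stay", "churn", "churn", "stay", "churn"]
plan = ["basic", "basic", "basic", "premium", "premium", "premium"]
customer_id = [1, 2, 3, 4, 5, 6]  # unique per row, useless for prediction

gain_plan = information_gain(plan, labels)
gain_id = information_gain(customer_id, labels)
# The unique-valued attribute receives the larger gain, which illustrates the bias.
print(gain_plan, gain_id)

# Hypothetical attribute column with one missing entry (None).
filled = fill_missing_by_frequency(["basic", None, "premium", "basic"], random.Random(0))
print(filled)
```

In this toy data the `customer_id` column splits every sample into its own pure group, so plain information gain ranks it above `plan` even though it has no predictive value. This is the kind of bias the abstract says the correction parameter is meant to counter.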

Keywords: data mining; decision tree; ID3 algorithm; clustering; pruning

Classification code: TP301.6 [Automation and Computer Technology: Computer Systems Architecture]

 
