An Improved C4.5 Algorithm and Experimental Analysis  Cited by: 13


Authors: Liu Jia (刘佳) [1], Wang Xinwei (王新伟) [1]

Affiliation: [1] Department of Computer Science and Technology, East China Normal University, Shanghai 200062

Source: Computer Applications and Software (《计算机应用与软件》), 2008, No. 12, pp. 260-262 (3 pages)

Abstract: When discretizing continuous-valued attributes, the C4.5 algorithm must compute the information gain of every potential cut point. Because it cannot locate the best cut point quickly, its time complexity is limited in this respect. Building on the proof by Fayyad and Irani, this paper improves C4.5 in two areas: the discretization of continuous-valued attributes and the penalty term applied to them. Experimental results show that the improved algorithm raises overall execution efficiency and also shows potential for reducing the classification error rate.
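The abstract does not spell out the modified procedure, so the following is only a minimal sketch of the idea it builds on. It assumes the Fayyad and Irani result that an information-gain-maximizing cut point always lies at a boundary between adjacent attribute values whose examples do not all share one class, so only those boundaries need to be evaluated. The function names (`boundary_cut_points`, `best_cut_point`) and the use of midpoints as thresholds are illustrative assumptions, not the authors' implementation, and the paper's penalty-term change is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def boundary_cut_points(values, labels):
    """Candidate thresholds restricted to class boundaries.

    A midpoint between two adjacent distinct values is kept only if the
    examples at those two values do not all belong to a single class.
    (Using midpoints is an assumption made for this sketch.)
    """
    classes_at = {}
    for v, y in zip(values, labels):
        classes_at.setdefault(v, set()).add(y)
    distinct = sorted(classes_at)
    return [
        (lo + hi) / 2
        for lo, hi in zip(distinct, distinct[1:])
        if len(classes_at[lo] | classes_at[hi]) > 1
    ]


def information_gain(values, labels, cut):
    """Information gain of the binary split value <= cut versus value > cut."""
    left = [y for v, y in zip(values, labels) if v <= cut]
    right = [y for v, y in zip(values, labels) if v > cut]
    n = len(labels)
    return (entropy(labels)
            - len(left) / n * entropy(left)
            - len(right) / n * entropy(right))


def best_cut_point(values, labels):
    """Evaluate only boundary candidates; return None if there are none."""
    candidates = boundary_cut_points(values, labels)
    if not candidates:
        return None
    return max(candidates, key=lambda c: information_gain(values, labels, c))


if __name__ == "__main__":
    values = [1, 2, 3, 4, 5, 6]
    labels = ["a", "a", "a", "b", "b", "b"]
    # Five midpoints exist, but only one is a class boundary.
    print(boundary_cut_points(values, labels))
    print(best_cut_point(values, labels))
```

On data with long runs of same-class examples, this reduces the number of information-gain evaluations from one per adjacent pair of distinct values to one per class boundary, which is the kind of saving the abstract attributes to the improved discretization step.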

Keywords: C4.5; cut point; discretization; penalty term

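Classification codes: TP391 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]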

 
