Affiliation: [1] Department of Computer Science and Technology, East China Normal University, Shanghai 200062
Source: Computer Applications and Software (《计算机应用与软件》), 2008, No. 12, pp. 260-262 (3 pages)
Abstract: When discretizing continuous-valued attributes, the C4.5 algorithm must compute the information gain of every potential cut point. Because it cannot locate the best cut point quickly, its time complexity is limited. Building on the proof by Fayyad and Irani, this paper improves how C4.5 discretizes continuous-valued attributes and how it applies the penalty term for continuous-valued attributes. Experimental results show that the improved algorithm raises overall execution efficiency and also has potential for reducing the classification error rate.
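Classification: TP391 [Automation and Computer Technology: Computer Application Technology]; TP18 [Automation and Computer Technology: Computer Science and Technology]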
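The abstract does not reproduce the paper's algorithm, so the following is only a minimal sketch of the general idea it builds on. Fayyad and Irani showed that an entropy-minimizing binary cut always falls on a boundary point, that is, between adjacent attribute values whose examples do not all share a single class. Candidate cuts inside a run of same-class values can therefore be skipped. The function names `candidate_cuts` and `best_cut`, the use of midpoints as thresholds, and the toy data are assumptions for illustration, not the authors' implementation; the penalty-term change mentioned in the abstract is not covered here.

```python
import math
from collections import Counter
from itertools import groupby


def entropy(counts, total):
    """Shannon entropy (base 2) of a class-count mapping."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def candidate_cuts(values, labels, boundary_only=True):
    """Midpoints between adjacent distinct values of one attribute.

    With boundary_only=True, a midpoint is skipped when both neighbouring
    value groups contain the same single class (hypothetical helper).
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    groups = [(v, {lab for _, lab in grp})
              for v, grp in groupby(pairs, key=lambda p: p[0])]
    cuts = []
    for (v1, c1), (v2, c2) in zip(groups, groups[1:]):
        same_single_class = len(c1) == 1 and c1 == c2
        if boundary_only and same_single_class:
            continue  # not a boundary point, cannot be the best cut
        cuts.append((v1 + v2) / 2)
    return cuts


def best_cut(values, labels, boundary_only=True):
    """Return (cut, information_gain) for the best binary split."""
    total = len(labels)
    base = entropy(Counter(labels), total)
    best = (None, -1.0)
    for cut in candidate_cuts(values, labels, boundary_only):
        left = Counter(lab for v, lab in zip(values, labels) if v <= cut)
        right = Counter(lab for v, lab in zip(values, labels) if v > cut)
        nl, nr = sum(left.values()), sum(right.values())
        gain = base - (nl / total) * entropy(left, nl) - (nr / total) * entropy(right, nr)
        if gain > best[1]:
            best = (cut, gain)
    return best


# Toy data (assumed): six sorted values, one class change in the middle.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "b"]
all_cuts = candidate_cuts(values, labels, boundary_only=False)
boundary_cuts = candidate_cuts(values, labels, boundary_only=True)
```

On this toy data the exhaustive scan evaluates five candidate cuts while the boundary-only scan evaluates one, and both select the same cut. The saving grows with the length of same-class runs in the sorted attribute, which is the source of the efficiency gain the abstract describes.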