Affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214112, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2016, No. 5, pp. 1265-1270, 1361 (7 pages)
Abstract: This work studies the computational efficiency and attribute-selection accuracy of the traditional C4.5 algorithm and proposes an improved version. The calculation formulas of the algorithm are simplified using Taylor series and the principle of equivalent infinitesimals, which improves computational efficiency. The mean of the GINI index of the other non-class attributes with respect to the candidate attribute is then introduced into the simplified information gain ratio formula to correct the error caused by redundancy among non-class attributes, which improves the accuracy of attribute selection. The improved algorithm is named G_C4.5. Comparative experiments on G_C4.5, traditional C4.5, and other improved algorithms show that G_C4.5 achieves some improvement in both classification efficiency and classification accuracy.
Keywords: C4.5 algorithm; Taylor series; equivalent infinitesimal; mean of the GINI index; correlation among non-class attributes; G_C4.5 algorithm
Classification number: TP311.5 [Automation and Computer Technology: Computer Software and Theory]
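Note: the abstract names two quantities, the information gain ratio used by C4.5 and a mean GINI index taken over the other non-class attributes, but does not give the simplified formula or state how the two are combined. The Python sketch below therefore shows only the standard textbook definitions. The function names and the direction of the GINI computation in `mean_gini_vs_others` (impurity of the candidate attribute after splitting on each other attribute) are assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """GINI impurity of a list of discrete values."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(values, target):
    """Group the target values by the corresponding attribute value."""
    groups = {}
    for v, t in zip(values, target):
        groups.setdefault(v, []).append(t)
    return groups


def gain_ratio(values, labels):
    """Standard C4.5 information gain ratio of one attribute (unsimplified)."""
    n = len(labels)
    groups = partition(values, labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:  # attribute takes a single value: no useful split
        return 0.0
    return (entropy(labels) - conditional) / split_info


def conditional_gini(values, target):
    """Weighted GINI impurity of `target` after splitting on `values`."""
    n = len(target)
    groups = partition(values, target)
    return sum(len(g) / n * gini(g) for g in groups.values())


def mean_gini_vs_others(columns, i):
    """ASSUMPTION: one possible reading of the 'mean GINI index' in the abstract.

    Averages the GINI impurity of attribute i after splitting on every other
    attribute j. A low value suggests attribute i is largely redundant.
    """
    others = [j for j in range(len(columns)) if j != i]
    return sum(conditional_gini(columns[j], columns[i]) for j in others) / len(others)
```

With `columns = [['a','a','b','b'], ['x','x','y','y'], ['p','q','p','q']]`, the first attribute is fully determined by the second and independent of the third, so its mean GINI against the others falls between the two extremes.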