C4.5 Algorithm Based on Attribute Dependency Calculation and PCA  Cited by: 5

Authors: HUANG Xiuxia (黄秀霞)[1], SUN Li (孙力)[1]

Affiliation: [1] School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214112, China

Source: Transducer and Microsystem Technologies (《传感器与微系统》), 2017, No. 1, pp. 131-134 (4 pages)

Abstract: The C4.5 algorithm suffers from a large number of logarithm operations, interference from irrelevant attributes, and the influence of correlation between attributes. To address these problems, a C4.5 algorithm based on attribute dependency calculation and principal component analysis (PCA) is proposed. The calculation formulas are simplified using the principle of equivalent infinitesimals. Attribute correlation is handled by computing attribute dependency and by borrowing the compression principle of PCA. Two new concepts, "average volatility" and "application weight", are introduced to obtain a new attribute selection measure. The algorithm is applied to the evaluation of students' comprehensive performance, and its performance is compared on UCI data sets. The experiments show that the improved algorithm gives more scientifically sound evaluation results, classifies more accurately, and computes more efficiently.
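For context, the attribute selection measure that standard C4.5 uses, and that the abstract sets out to improve, is the gain ratio. The sketch below is a minimal illustration of that baseline only, not the authors' improved measure: the abstract does not give the simplified formulas or the definitions of "average volatility" and "application weight", so none of them are reproduced here. The function names `entropy` and `gain_ratio` and the toy data are assumptions made for illustration. Each `math.log2` call is an instance of the logarithm operations the abstract describes as costly.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values: the attribute value of each sample
    labels: the class label of each sample
    (illustrative helper, not taken from the paper)
    """
    total = len(labels)
    # Group the class labels by attribute value.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus the
    # size-weighted entropy of the partitions after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)
    # An attribute with a single value cannot split the data.
    return gain / split_info if split_info > 0 else 0.0


# Toy data: "informative" separates the classes perfectly,
# "irrelevant" carries no information about the class.
labels = ["y", "y", "n", "n"]
informative = ["a", "a", "b", "b"]
irrelevant = ["a", "b", "a", "b"]
```

On this toy data, `gain_ratio(informative, labels)` is larger than `gain_ratio(irrelevant, labels)`, so standard C4.5 would split on the informative attribute first.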

Keywords: C4.5 algorithm; attribute dependency calculation; principal component analysis; average volatility; application weight

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
