On a KNN Algorithm Based on Optimising Similarity Distance with Entropy Noise Reduction  Cited by: 4

Author: Liu Jinsheng (刘晋胜) [1]

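Affiliation: [1] College of Computer and Electronic Information, Guangdong University of Petrochemical Technology, Maoming 525000, Guangdong, China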

Source: Computer Applications and Software (《计算机应用与软件》), 2015, No. 9, pp. 254-256, 285 (4 pages in total)

Funding: Guangdong Province and Ministry of Education Industry-University-Research Cooperation Project (2011A090200088)

Abstract: This paper seeks a similarity distance metric for the KNN algorithm that is both accurate and efficient. Based on the class characteristics of the entropy transform indicators of the feature parameters, it proposes a similarity distance metric that uses these entropy feature transform indicators to quantify the differences between classes, thereby reducing class noise in the feature parameters. Several KNN variants are compared: entropy noise-reduction optimisation (the proposed method), entropy correlation difference, class credibility calculation, traditional Euclidean distance, and identical feature parameters. Theoretical analysis, simulation experiments on the Letter and Pima Indians Diabetes datasets, and a practical application to KDD CUP'99 all indicate that the proposed algorithm performs well within KNN.

Keywords: K-nearest-neighbour classification; entropy feature transform; noise reduction; similarity distance

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
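
The abstract does not state the paper's distance formula, so the following is only an illustrative sketch of the general idea of using entropy to down-weight noisy features in a KNN distance. It assumes a generic information-gain-weighted Euclidean distance, which is a stand-in and not the author's metric; the function names (`entropy`, `information_gain`, `entropy_weights`, `knn_predict`), the equal-width binning, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels, bins=4):
    """Entropy reduction obtained by splitting one feature into equal-width bins."""
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins or 1.0  # constant feature: put everything in bin 0
    groups = {}
    for x, label in zip(column, labels):
        b = min(int((x - lo) / width), bins - 1)
        groups.setdefault(b, []).append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return max(0.0, entropy(labels) - conditional)  # clamp rounding noise


def entropy_weights(X, y, bins=4):
    """Assumed weighting scheme: information gain per feature, normalised to sum to 1."""
    gains = [information_gain([row[j] for row in X], y, bins)
             for j in range(len(X[0]))]
    total = sum(gains)
    if total == 0:
        return [1.0 / len(gains)] * len(gains)  # fall back to uniform weights
    return [g / total for g in gains]


def knn_predict(X, y, query, weights, k=3):
    """Majority vote among the k nearest rows under a weighted Euclidean distance."""
    dists = sorted(
        (math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, row, query))), label)
        for row, label in zip(X, y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 is class-independent noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.2], [0.9, 0.8], [1.1, 9.1]]
y = ["a", "a", "a", "b", "b", "b"]
weights = entropy_weights(X, y)
prediction = knn_predict(X, y, [0.15, 9.0], weights, k=3)
```

On this toy data the noisy second feature receives zero weight, so the query is assigned to class `a`; with uniform weights the same query is pulled towards class `b` by the noise feature. This only illustrates why an entropy-derived weighting can reduce feature noise in KNN and says nothing about the accuracy figures reported in the paper.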

 
