Authors: LIU Jia-yu, ZHOU Ling-yun, WU Qiu-feng [4], MENG Xiang-yan [4], DENG Hua-ling [4]
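Affiliations: [1] College of Economics and Management, Northeast Agricultural University, Harbin 150030, Heilongjiang, China; [2] Department of Economics, Heilongjiang University of Finance and Economics, Harbin 150030, Heilongjiang, China; [3] College of Engineering, Northeast Agricultural University, Harbin 150030, Heilongjiang, China; [4] College of Science, Northeast Agricultural University, Harbin 150030, Heilongjiang, China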
Source: Mathematics in Practice and Theory (《数学的实践与认识》), 2020, No. 16, pp. 132-143 (12 pages)
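Funding: Special Fund for Agro-scientific Research in the Public Interest, second-level task (201503116-04-06); Heilongjiang Postdoctoral Fund (LBHZ15020); National Science and Technology Support Program, special task (2014BAD12B01-1-3); Harbin Science and Technology Innovation Talent Research Special Fund (Young Reserve Talent) (2017RAQXJ096); Integration and Demonstration of High-Efficiency Water Use Technology for Japonica Rice in Semi-humid Regions (2018YFD0300105-2).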
Abstract: The traditional K-nearest-neighbor (KNN) algorithm applies only to data with numerical attributes. To address this, the paper proposes a KNN algorithm that assigns different weights to different attribute columns of mixed-attribute data (K Nearest Neighbor for Mixed-attribute Data, KNNM), so that KNN can be used on mixed-attribute data. Two observations motivate the method. First, the numerical part and the categorical part of a mixed-attribute record contribute differently to the overall similarity measure. Second, within each part, the individual components contribute differently to that part's similarity measure. The paper therefore proposes a vector-parameter computation method that accounts for both the part-level contributions and the component-level contributions, and builds KNNM on it. Experiments on five UCI datasets show that KNNM outperforms traditional KNN in accuracy, macro-averaged recall, macro-averaged precision, macro-averaged F1, and ROC, which indicates that KNNM is applicable to and effective on mixed-attribute data.
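The two levels of weighting described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method: the abstract does not say how the weight vectors are computed or which distance functions are used. Here the weights are supplied by hand, the numerical part uses a weighted Euclidean distance, and the categorical part uses a weighted mismatch count. The names `mixed_distance`, `knn_predict`, `w_num`, `w_cat`, `alpha`, and `beta` are hypothetical.

```python
import math
from collections import Counter


def mixed_distance(a, b, num_idx, cat_idx, w_num, w_cat, alpha, beta):
    """Distance between two mixed-attribute records (illustrative only).

    w_num / w_cat: per-component weights inside each part (assumed given).
    alpha / beta: part-level weights for the numerical and categorical
    parts (assumed given; the paper derives its own parameters).
    """
    # Numerical part: weighted Euclidean distance over the numeric columns.
    d_num = math.sqrt(sum(w * (a[i] - b[i]) ** 2 for i, w in zip(num_idx, w_num)))
    # Categorical part: weighted count of mismatching categorical columns.
    d_cat = sum(w * (a[i] != b[i]) for i, w in zip(cat_idx, w_cat))
    return alpha * d_num + beta * d_cat


def knn_predict(train_X, train_y, query, k, **params):
    """Majority vote among the k training records closest to `query`."""
    scored = sorted(
        ((mixed_distance(x, query, **params), y) for x, y in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: column 0 is numerical, column 1 is categorical.
train_X = [(1.0, "red"), (1.2, "red"), (5.0, "blue"), (5.5, "blue"), (5.2, "red")]
train_y = ["A", "A", "B", "B", "B"]
params = dict(num_idx=[0], cat_idx=[1], w_num=[1.0], w_cat=[1.0], alpha=0.5, beta=0.5)

print(knn_predict(train_X, train_y, (1.1, "red"), k=3, **params))
```

In practice the numerical columns would normally be rescaled before use, so that `alpha` and `beta` express relative importance rather than compensating for differences in units.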