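Authors: XU Hang [1], MA Shuai, WEN Xiaoke, LI Yuxuan (School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China)
Affiliation: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China
Source: Journal of Shanxi University (Natural Science Edition), 2023, No. 4, pp. 811-820 (10 pages)
Funding: National Natural Science Foundation of China (62206161); Science and Technology Innovation Project of Higher Education Institutions of Shanxi Province (2020L0026).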
Abstract: Because of the complex relationship between ordered and unordered features, existing classification methods do not handle classification on mixed data (data containing both ordered and unordered features) effectively. To address this problem, a classification method for mixed data based on k-nearest neighbor (MDKNN) is proposed. First, the distance between samples is computed by treating ordered and unordered features separately, which captures both the order information and the statistical information of the features. Then, nearest neighbors are selected separately from the training samples that are superior to and inferior to the sample being predicted, and their class membership degrees are computed from fuzzy relations to determine the range of class labels for that sample, which ensures the monotonicity of the prediction. Finally, the classification result is computed within this range. Experiments were conducted on 12 public datasets from UCI and WEKA, comparing the method with the k-nearest-neighbor-based algorithms MKNN, FKNN, and MFKNN and with the non-k-nearest-neighbor-based algorithms PMDT, OLM, and OSDL. The proposed method achieved the highest average accuracy in every comparison, improving on the best algorithm in each group, MFKNN and PMDT, by 7.13% and 9.84%, respectively, which demonstrates its effectiveness.
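The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' MDKNN. Everything in it is an assumption made for illustration: the function names (`mixed_distance`, `dominates`, `label_range`), the distance itself (absolute difference on ordered features assumed scaled to [0, 1], plus a 0/1 mismatch on unordered features), and the use of a plain max/min over neighbor labels in place of the fuzzy class membership degrees the paper describes.

```python
from typing import List, Sequence, Tuple


def mixed_distance(a: Sequence, b: Sequence,
                   ordered_idx: Sequence[int],
                   unordered_idx: Sequence[int]) -> float:
    """Hypothetical mixed distance (not the paper's formula).

    Ordered features (assumed scaled to [0, 1]) contribute |a_i - b_i|;
    unordered features contribute 1 when the values differ, else 0.
    """
    d_ordered = sum(abs(a[i] - b[i]) for i in ordered_idx)
    d_unordered = sum(1.0 for i in unordered_idx if a[i] != b[i])
    return d_ordered + d_unordered


def dominates(a: Sequence, b: Sequence, ordered_idx: Sequence[int]) -> bool:
    """True when a is at least as large as b on every ordered feature."""
    return all(a[i] >= b[i] for i in ordered_idx)


def label_range(x: Sequence, train_X: List[Sequence], train_y: List[int],
                ordered_idx: Sequence[int], unordered_idx: Sequence[int],
                k: int = 3) -> Tuple[int, int]:
    """Bound the label of x using neighbors that are inferior/superior to it.

    Assumption: the lower bound is the largest label among the k nearest
    inferior samples, and the upper bound is the smallest label among the
    k nearest superior samples. The paper uses fuzzy memberships instead.
    """
    inferior, superior = [], []
    for t, y in zip(train_X, train_y):
        d = mixed_distance(x, t, ordered_idx, unordered_idx)
        if dominates(x, t, ordered_idx):
            inferior.append((d, y))
        if dominates(t, x, ordered_idx):
            superior.append((d, y))
    # Sort by distance only, so labels are never compared on ties.
    inferior.sort(key=lambda pair: pair[0])
    superior.sort(key=lambda pair: pair[0])
    low = max((y for _, y in inferior[:k]), default=min(train_y))
    high = min((y for _, y in superior[:k]), default=max(train_y))
    if low > high:
        # Non-monotone training data: swap so the range stays valid.
        low, high = high, low
    return low, high


# Toy data: feature 0 is ordered, feature 1 is unordered.
train_X = [(0.1, "a"), (0.2, "a"), (0.8, "b"), (0.9, "a")]
train_y = [0, 1, 2, 3]
bounds = label_range((0.5, "a"), train_X, train_y, [0], [1], k=2)
```

A final classifier would then pick one label inside `bounds`, for example by a distance-weighted vote restricted to that range; that step is omitted here because the abstract does not specify how it is done.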
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]