Authors: ZHU Peng (祝鹏); GUO Yanguang (郭艳光) (Department of Computer Technology and Information Management, Inner Mongolia Agricultural University, Baotou 014109, Inner Mongolia Autonomous Region, China)
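Affiliation: [1] Department of Computer Technology and Information Management, Inner Mongolia Agricultural University, Baotou 014109, Inner Mongolia, China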
Source: Journal of Jilin University: Science Edition (《吉林大学学报(理学版)》), 2023, No. 3, pp. 665-670 (6 pages)
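Funding: Science and Technology Major Special Project of Inner Mongolia Autonomous Region (Grant No. 2021SZD0012-1); Science and Technology Plan Project of Inner Mongolia Autonomous Region (Grant No. 2020GG0033); Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (Grant No. NJZY20055); Philosophy and Social Science Planning Project of Inner Mongolia Autonomous Region (Grant No. 2020NDC067).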
Abstract: To address the difficulty of integrating multi-source information data when the source domains have low similarity and are hard to identify, we propose an integration method based on the K-medoids clustering algorithm. First, the clustering of multi-source data is treated as a transfer learning process: the weights of the initial samples are determined, the learning characteristics of the weights and the expected loss of the training samples are recorded at each iteration, and these characteristic parameters are used to decide whether each data item belongs to the source domain or the target domain. Then, clustering in the integration algorithm is transformed into a diversified domain-value labeling problem. Once the data have clustering characteristics, the weight factors between the data to be integrated are computed separately for the source domain and the target domain, the coverage characteristics of the weight factors are used to determine the amount of interactive information between the two, and data with a high amount of information are integrated to ensure the success rate of integration. Simulation results show that the proposed algorithm achieves efficient integration both on stable datasets with few items and on disordered datasets with many heterogeneous items, with few secondary integrations and low overall cost.
Keywords: K-medoids clustering algorithm; multi-source data; source domain; target domain; amount of interactive information
Classification: TP393.09 [Automation and Computer Technology: Computer Application Technology]
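Note: the abstract names K-medoids clustering as the basis of the method but gives no algorithmic detail. For orientation only, the following is a minimal sketch of the generic K-medoids procedure (alternating assignment and medoid update), assuming Euclidean distance. It is not the authors' integration method: the transfer-learning weighting, domain labeling, and weight-factor steps are not reproduced. The function name `k_medoids`, its parameters, and the sample points are hypothetical.

```python
import random
from math import dist


def k_medoids(points, k, init=None, max_iter=100, seed=0):
    """Generic alternating K-medoids (illustrative, not the paper's method).

    Returns (medoids, labels): medoids are indices into `points`,
    labels[i] is the cluster number assigned to points[i].
    """
    n = len(points)
    # Pairwise Euclidean distances (assumed metric).
    d = [[dist(p, q) for q in points] for p in points]
    # Start from caller-supplied medoid indices, else a random sample.
    medoids = list(init) if init is not None else random.Random(seed).sample(range(n), k)

    def assign(meds):
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: d[i][meds[c]]) for i in range(n)]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance
            # to the other members of its cluster.
            new_medoids.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, assign(medoids)


# Two well-separated groups of made-up 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
medoids, labels = k_medoids(points, k=2, init=[0, 5])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Unlike K-means, each cluster center here is an actual data item, which is why the sketch returns indices rather than coordinates. The result depends on the starting medoids, so the example passes `init` explicitly.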