Multi-source Information Data Integration Algorithm Based on K-Medoids Clustering Algorithm

Cited by: 10

Authors: ZHU Peng (祝鹏); GUO Yanguang (郭艳光) (Department of Computer Technology and Information Management, Inner Mongolia Agricultural University, Baotou 014109, Inner Mongolia Autonomous Region, China)

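Affiliation: [1] Department of Computer Technology and Information Management, Inner Mongolia Agricultural University, Baotou 014109, Inner Mongolia, China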

Source: Journal of Jilin University: Science Edition (《吉林大学学报（理学版）》), 2023, No. 3, pp. 665-670 (6 pages)

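Funding: Science and Technology Major Special Project of Inner Mongolia Autonomous Region (Grant No. 2021SZD0012-1); Science and Technology Plan Project of Inner Mongolia Autonomous Region (Grant No. 2020GG0033); Scientific Research Project of Higher Education Institutions of Inner Mongolia Autonomous Region (Grant No. NJZY20055); Philosophy and Social Science Planning Project of Inner Mongolia Autonomous Region (Grant No. 2020NDC067).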

Abstract: To address the difficulty of integrating multi-source information data whose source domains have low similarity and are hard to identify, an integration method based on the K-medoids clustering algorithm is proposed. First, the clustering of multi-source data is treated as a transfer learning process: the initial sample weights are determined, the learning characteristics of the weights and the expected loss of the training samples are recorded at each iteration, and these characteristic parameters are then used to decide whether each data item belongs to the source domain or the target domain. Next, the clustering in the integration algorithm is recast as a diversified domain-value labeling problem. Once the data exhibit clustering characteristics, the weight factors between the data to be integrated in the source domain and in the target domain are computed separately, the coverage characteristics of the weight factors are used to determine the amount of interactive information between the two, and data with a high amount of information are integrated to ensure the success rate of integration. Simulation results show that the algorithm integrates efficiently both on stable datasets with few items and on disordered, larger, and more heterogeneous datasets, with few secondary integrations and low overall cost.
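The abstract names K-medoids as the clustering step but does not give the authors' implementation. As a rough illustration of that step alone, the sketch below alternates between assigning points to their nearest medoid and re-picking each cluster's medoid. The function names (`euclidean`, `farthest_first_init`, `assign`, `k_medoids`), the Euclidean distance, the farthest-first initialization, and the toy data are all assumptions made for illustration; the transfer-learning weighting and the interactive-information test described in the abstract are not modeled here.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance (an assumed dissimilarity, not taken from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def farthest_first_init(points: List[Point], k: int) -> List[int]:
    """Deterministic start: index 0, then repeatedly the point farthest from the chosen medoids."""
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(euclidean(points[i], points[m]) for m in medoids),
        ))
    return medoids


def assign(points: List[Point], medoids: List[int]) -> List[int]:
    """For each point, return the position in `medoids` of its nearest medoid."""
    return [
        min(range(len(medoids)), key=lambda j: euclidean(p, points[medoids[j]]))
        for p in points
    ]


def k_medoids(points: List[Point], k: int, max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Alternating K-medoids: returns (medoid indices, cluster label per point)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    medoids = farthest_first_init(points, k)
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:  # possible with duplicate points; keep the old medoid
                new_medoids.append(medoids[j])
                continue
            # The medoid is the member with the smallest total distance to its cluster.
            new_medoids.append(min(
                members,
                key=lambda c: sum(euclidean(points[c], points[m]) for m in members),
            ))
        if new_medoids == medoids:  # no medoid changed: converged
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, labels = k_medoids(points, k=2)
print(medoids, labels)  # prints [0, 3] [0, 0, 0, 1, 1, 1]
```

Unlike K-means, each cluster center here is an actual data item, which is why the method needs only pairwise distances and is less sensitive to outliers. This alternating scheme can stop in a local optimum, so the result depends on the initialization.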

Keywords: K-medoids clustering algorithm; multi-source data; source domain; target domain; amount of interactive information

Classification code: TP393.09 [Automation and Computer Technology: Computer Application Technology]
