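Authors: 郭光根 (GUO Guanggen); 何蕊 (HE Rui) [1]; 张玉军 (ZHANG Yujun)
Affiliation: [1] Kunming Cigarette Factory, Hongyun Honghe Tobacco (Group) Co., Ltd., Kunming 650000
Source: 《科技创新与应用》 (Technology Innovation and Application), 2024, No. 35, pp. 39-43 (5 pages)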
Abstract: Data in tobacco logistics operations come from a wide variety of sources. The data differ in format, have complex structures, and are often scattered across separate information systems, so throughput is low when logistics data are integrated. To address this, a multi-source tobacco logistics data integration method for heterogeneous environments based on K-medoids clustering is proposed. Undersampling balances the class distribution, and cleaning based on data correlation and a threshold removes redundant information, improving the quality of the multi-source tobacco logistics data. A data integration framework based on K-medoids clustering is then designed: transfer learning dynamically adjusts source-domain weights to optimize clustering performance in the target domain, and new data points subject to similarity constraints are introduced as initial cluster centers, achieving effective integration of the multi-source data. Experimental results show that the proposed method uses clustering to group and integrate data from different sources effectively, reducing data-processing complexity and improving data-integration throughput.
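Keywords: K-medoids clustering; heterogeneous environment; multi-source data; tobacco logistics data; data integration method
Classification code: TP311.1 [Automation and Computer Technology: Computer Software and Theory]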
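The clustering step named in the abstract can be illustrated with a minimal sketch of plain K-medoids (alternating assignment and medoid update). This is not the authors' implementation: it omits the undersampling, the correlation-threshold cleaning, and the transfer-learning weighting of source domains. The records, the two "sources", the Manhattan distance, and the initial medoid indices are all hypothetical values chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_medoids(
    points: Sequence[T],
    dist: Callable[[T, T], float],
    init: Sequence[int],
    max_iter: int = 100,
) -> Tuple[List[int], List[int]]:
    """Plain alternating K-medoids; returns (medoid indices, cluster labels)."""
    n, k = len(points), len(init)
    # Only pairwise dissimilarities are needed, so precompute them once.
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]

    def assign(meds: List[int]) -> List[int]:
        # Each point joins the cluster of its nearest medoid.
        return [min(range(k), key=lambda c: d[i][meds[c]]) for i in range(n)]

    medoids = list(init)
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Empty cluster (possible with duplicate points): keep the old medoid.
                new_medoids.append(medoids[c])
                continue
            # The new medoid is the member with the lowest total distance to the others.
            new_medoids.append(min(members, key=lambda m: sum(d[m][j] for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


def manhattan(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


# Hypothetical records from two systems, already mapped to two shared numeric fields.
source_a = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5)]
source_b = [(8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
records = source_a + source_b

# Deliberately poor start: both initial medoids come from the same group.
medoids, labels = k_medoids(records, manhattan, init=[0, 1])
print("medoids:", [records[m] for m in medoids])
print("labels:", labels)
```

Because K-medoids needs only a pairwise dissimilarity and always picks an actual record as each cluster center, the `dist` argument can be swapped for any measure suited to mixed-format records, which is one common reason to choose it over K-means for heterogeneous data.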