High-quality Data Resource Identification of Network Trading Platform Based on K-medoids-NCA-SMOTE-BSVM Model


Authors: NI Yuan [1,2]; LI Siyuan; XU Lei; ZHANG Jian; FANG Jinyu [1]

Affiliations: [1] School of Economics and Management, Beijing Information Science and Technology University, Beijing 100192, China; [2] Beijing Key Laboratory of Green Development Big Data Decision, Beijing 100192, China

Source: Operations Research and Management Science (《运筹与管理》), 2023, No. 11, pp. 87-93, I0040, I0041 (9 pages)

Funding: National Key Research and Development Program of China (2017YFB1400400).

Abstract: As data service forms continue to multiply, data resources have become an emerging factor of production, and demand for trading and circulating them has grown explosively. Identifying high-quality data resources within massive data volumes, and thereby unlocking the value of this factor, has become key for data trading platforms that want to gain competitive advantage and improve the efficiency of factor allocation. This paper aims to discover the key factors behind the formation of high-quality data in the platform trading context and to propose a method for efficiently identifying high-quality data among large-scale, heterogeneous data resources. First, based on the formation process of high-quality data, a two-dimensional identification index system of "intrinsic quality-commodity characterization" is constructed. Then, a K-medoids-NCA-SMOTE-BSVM fusion model is proposed to classify and predict data of three quality levels: high, medium, and low. Finally, API transaction data from a real data trading platform are collected for an empirical study. The results show that, compared with SVM, WOA-SVM, PSO-SVM, MLP, and CNN, the K-medoids-NCA-SMOTE-BSVM model performs well in both prediction accuracy and training time. The proposed identification indicators and classification model provide a basis for judging and predicting data quality in the platform economy, and have practical significance for setting data quality standards from a product perspective and for optimizing data transaction pricing.

As an emerging factor of production, data resources have seen explosive growth in demand for trading and circulation. Data quality problems have sparked widespread concern as data scale grows exponentially, and large amounts of low-quality data are flooding into data resource trading platforms of all types. Identifying high-quality data resources among massive resources has become the key for data trading platforms to gain competitive advantages and improve the efficiency of factor allocation. Existing research provides a basis for high-quality data identification in the platform trading context, but two deficiencies remain. First, existing identification methods are applicable only to the quality evaluation of small-scale, homogeneous data resources; they rely heavily on manual participation and are insufficiently automated, so they struggle to meet the requirements of quality identification for large-scale data resources. Second, existing identification methods ignore the uneven distribution of data resources of different quality, which easily biases classification results and makes it difficult to meet the robustness requirements of heterogeneous sample classification. This paper aims to clarify the mechanism by which high-quality data resources form in the context of platform transactions, to discover the key factors in that formation, and to propose a method for efficiently recognizing high-quality data in large-scale, heterogeneous data resources. Data circulation and transaction are necessary for realizing the value of data resources, and in a platform economy, data circulation takes the form of an open market with numerous participants. This paper studies the flow of data resources and the generation process of high-quality data within the platform environment and builds a high-quality data resource identification index system of "intrinsic quality-commodity characterization". Af
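The abstract notes that data resources of different quality are unevenly distributed and that the fusion model includes a SMOTE stage. As a rough illustration of that one stage only, the following pure-Python sketch oversamples the smaller classes by interpolating between a sample and one of its nearest same-class neighbours. It is a sketch under stated assumptions, not the authors' implementation: the function names `smote` and `balance`, the two feature columns, and all numeric values are hypothetical, and the K-medoids, NCA, and BSVM stages are not shown because the abstract does not specify them.

```python
import math
import random
from collections import Counter


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # relative position along base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def balance(X, y, k=2, seed=0):
    """Oversample every class up to the size of the largest class."""
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        if count == target:
            continue
        minority = [x for x, lab in zip(X, y) if lab == label]
        new_points = smote(minority, target - count, k=k, seed=seed)
        X_out.extend(new_points)
        y_out.extend([label] * len(new_points))
    return X_out, y_out


# Hypothetical toy data: two made-up indicator scores per data resource,
# with three quality labels in imbalanced proportions.
X = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
    (0.30, 0.20), (0.25, 0.10), (0.20, 0.30),   # low
    (0.50, 0.50), (0.55, 0.45), (0.60, 0.50),   # medium
    (0.90, 0.80), (0.80, 0.90),                 # high
]
y = ["low"] * 6 + ["medium"] * 3 + ["high"] * 2

X_balanced, y_balanced = balance(X, y)
print(Counter(y_balanced))
```

Because interpolation stays between existing same-class samples, the synthetic points do not leave the region already occupied by that class. In a full pipeline the balanced set would then be passed to the classifier.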

Keywords: data trading platform; high-quality data; K-medoids-NCA-SMOTE-BSVM; multi-model ensemble

Classification number: F724.6 [Economics and Management: Industrial Economics]

 
