Particle swarm optimization based data imputation method for mixed features
(Chinese title: 面向混合特征数据的粒子群填补方法)

Cited by: 1

Authors: LIU Yi (刘艺); QIN Wei (秦伟); LI Gengsong (李庚松); LIU Kun (刘坤); WANG Qiang (王强); ZHENG Qibin (郑奇斌); REN Xiaoguang (任小广)

Affiliation: [1] Academy of Military Sciences, Beijing 100091, China

Source: Journal of National University of Defense Technology (《国防科技大学学报》), 2024, No. 6, pp. 107-112 (6 pages)

Funding: National Natural Science Foundation of China (91948303); National Natural Science Foundation of China, Young Scientists Fund (61802426).

Abstract: Traditional data imputation methods struggle to exploit label information and the randomness inherent in missing data. To address this, a particle swarm optimization based imputation algorithm for mixed features is proposed. The values of each continuous feature are modeled as a Gaussian distribution whose mean and standard deviation serve as optimization parameters; for each discrete (categorical) feature, the probabilities of its values are the optimized parameters. Classification accuracy is used as the optimization objective, so that both the label information and the randomness of the missing data are exploited. Four statistics-based methods and two evolutionary-algorithm-based imputation methods serve as baselines in experiments on six typical classification datasets. The results show that the proposed method significantly outperforms the baselines in classification accuracy while keeping a comparatively favorable time overhead, and that it effectively handles missing data with mixed features.
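The abstract gives only the outline of the method, so the following is a minimal sketch of that outline and not the authors' implementation. It assumes missing entries are marked as NaN, that categorical values are stored as integer codes, and that category probabilities are obtained from a softmax over unconstrained particle coordinates. The nearest-centroid classifier, the fixed sampling seed (used so that the fitness is deterministic), the PSO coefficients, and all function names (`impute`, `accuracy`, `pso_impute`) are illustrative assumptions not taken from the paper.

```python
import numpy as np

def impute(X, particle, cont_cols, cat_cols, seed=0):
    """Fill NaNs by sampling from the distributions encoded in `particle`."""
    rng = np.random.default_rng(seed)  # assumption: fixed seed keeps fitness deterministic
    X = X.copy()
    i = 0
    for c in cont_cols:  # two parameters per continuous column: mean and std
        mean, std = particle[i], abs(particle[i + 1]) + 1e-6
        miss = np.isnan(X[:, c])
        X[miss, c] = rng.normal(mean, std, miss.sum())
        i += 2
    for c, k in cat_cols.items():  # k unconstrained values per categorical column
        logits = particle[i:i + k]
        p = np.exp(logits - logits.max())
        p /= p.sum()  # softmax turns them into value probabilities
        miss = np.isnan(X[:, c])
        X[miss, c] = rng.choice(k, size=miss.sum(), p=p)
        i += k
    return X

def accuracy(X, y):
    """Training accuracy of a nearest-centroid classifier (a stand-in classifier)."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    dist = ((X[:, None, :] - centroids[None]) ** 2).sum(axis=2)
    return float((classes[dist.argmin(axis=1)] == y).mean())

def pso_impute(X, y, cont_cols, cat_cols, n_particles=20, n_iter=30, seed=0):
    """Global-best PSO over distribution parameters; fitness is classification accuracy."""
    rng = np.random.default_rng(seed)
    dim = 2 * len(cont_cols) + sum(cat_cols.values())
    pos = rng.normal(size=(n_particles, dim))
    vel = np.zeros_like(pos)
    fit = lambda p: accuracy(impute(X, p, cont_cols, cat_cols), y)
    pbest, pbest_f = pos.copy(), np.array([fit(p) for p in pos])
    g = pbest_f.argmax()
    gbest, gbest_f = pbest[g].copy(), pbest_f[g]
    history = [gbest_f]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia 0.7 and acceleration 1.5 are common textbook choices (assumed).
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([fit(p) for p in pos])
        better = f > pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        g = pbest_f.argmax()
        if pbest_f[g] > gbest_f:
            gbest, gbest_f = pbest[g].copy(), pbest_f[g]
        history.append(gbest_f)
    return impute(X, gbest, cont_cols, cat_cols), gbest_f, history

# Toy data (synthetic): column 0 is continuous, column 1 is a 3-way categorical code.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 60)
X = np.column_stack([rng.normal(y * 3.0, 1.0), rng.integers(0, 3, 60).astype(float)])
X_missing = X.copy()
X_missing[rng.random(60) < 0.2, 0] = np.nan
X_missing[rng.random(60) < 0.2, 1] = np.nan
X_filled, best_acc, history = pso_impute(X_missing, y, cont_cols=[0], cat_cols={1: 3})
```

Each particle here holds one (mean, standard deviation) pair per continuous column and one probability vector per categorical column, so the search space stays small regardless of how many entries are missing. Evaluating accuracy on the training data is a simplification; a held-out split or cross-validation would be the more cautious choice in practice.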

Keywords: missing data; data imputation; particle swarm optimization; mixed features; classification

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
