Authors: REN Sheng-bing (任胜兵) [1], LIAO Xiang-dang (廖湘荡); School of Software, Central South University, Changsha 410075, China
Source: Computer Engineering & Science (《计算机工程与科学》), 2018, No. 10, pp. 1787-1795 (9 pages)
Abstract: Software defect prediction is a typical imbalanced learning problem. Building on CS-SVM and a clustering algorithm, we improve the cost-sensitive support vector machine (SVM) and propose the CCS-SVM software defect prediction model. CCS-SVM combines SVM with class misclassification costs and uses an evaluation metric for imbalanced data as the objective function when optimizing the misclassification cost factors, which raises the recognition rate of minority-class samples. Clustering locates the center point of each class; the distance from a sample to its class center defines that sample's class confidence, so each sample is assigned its own misclassification cost coefficient. Introducing this confidence into the cost-sensitive SVM optimization problem improves the robustness of the algorithm and the classification performance of the SVM. To improve the generalization ability of the model, a genetic algorithm is used to optimize feature selection and model parameters. Experiments on the NASA Metrics Data Program (MDP) datasets show that the proposed method clearly improves the G-mean and F-measure evaluation values.
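The confidence-weighting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class mean stands in for the cluster center, the linear distance-to-confidence mapping and the `floor` parameter are assumptions, and all function names are hypothetical. The metrics use the common definitions G-mean = sqrt(recall x specificity) and F-measure = 2PR / (P + R), which the paper may define slightly differently.

```python
import math


def class_centroids(X, y):
    """Mean vector per class (assumed stand-in for the cluster center)."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}


def class_confidence(X, y, floor=0.1):
    """Confidence in [floor, 1]: samples nearer their class center score higher.

    The linear mapping 1 - d / d_max is an illustrative assumption.
    """
    centroids = class_centroids(X, y)
    dists = [math.dist(xi, centroids[yi]) for xi, yi in zip(X, y)]
    max_d = {}
    for d, yi in zip(dists, y):
        max_d[yi] = max(max_d.get(yi, 0.0), d)
    conf = []
    for d, yi in zip(dists, y):
        if max_d[yi] == 0.0:
            conf.append(1.0)  # every sample of this class sits on the center
        else:
            conf.append(max(floor, 1.0 - d / max_d[yi]))
    return conf


def sample_costs(y, confidence, class_cost):
    """Per-sample misclassification cost = class cost x sample confidence."""
    return [class_cost[yi] * ci for yi, ci in zip(y, confidence)]


def g_mean_and_f_measure(y_true, y_pred, positive=1):
    """G-mean and F-measure for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(recall * specificity)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return g_mean, f_measure


if __name__ == "__main__":
    # Toy data: class 1 plays the role of the minority (defective) class.
    X = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
    y = [0, 0, 0, 1]
    conf = class_confidence(X, y)
    print(sample_costs(y, conf, {0: 1.0, 1: 5.0}))
    print(g_mean_and_f_measure([1, 1, 0, 0], [1, 0, 0, 0]))
```

One plausible way to use such costs is as per-sample weights when fitting an SVM, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`; whether that matches the CCS-SVM formulation exactly is not stated in the abstract.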
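Keywords: software defect prediction; cost-sensitive; support vector machine; imbalanced data classification; parameter selection; genetic algorithm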
Classification number: TP311.5 [Automation and Computer Technology: Computer Software and Theory]