Software defect prediction based on cost-sensitive support vector machine  Cited by: 7


Authors: 任胜兵 (REN Sheng-bing) [1]; 廖湘荡 (LIAO Xiang-dang) (School of Software, Central South University, Changsha 410075, China)

Affiliation: [1] School of Software, Central South University, Changsha 410075, Hunan, China

Source: Computer Engineering & Science (《计算机工程与科学》), 2018, No. 10, pp. 1787-1795 (9 pages)

Abstract: Software defect prediction is a typical imbalanced learning problem. We improve the cost-sensitive support vector machine (SVM) algorithm using CS-SVM and a clustering algorithm, and propose the CCS-SVM software defect prediction model. CCS-SVM combines the SVM with class misclassification costs and uses an evaluation metric for imbalanced data as the objective function to optimize the misclassification cost factor, which raises the recognition rate of minority-class samples. Clustering is used to find the center point of each class; the class confidence of each sample is defined from its distance to its class center, so that each sample receives its own misclassification cost coefficient. Introducing this confidence into the cost-sensitive SVM optimization problem improves the robustness of the algorithm and the classification performance of the SVM. In addition, to improve the generalization ability of the model, a genetic algorithm is used to optimize feature selection and model parameters. Experiments on the NASA Metrics Data Program (MDP) datasets show that the proposed method clearly improves the G-mean and F-measure evaluation values.
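The abstract describes per-sample misclassification costs derived from each sample's distance to its class center, evaluated with G-mean and F-measure. The following is a minimal sketch of that idea, not the authors' formulation: it assumes one cluster per class (the class mean stands in for the clustering step), and the confidence formula `1 - d / (radius + eps)` as well as the names `class_centers`, `confidence_weights`, `class_cost`, `g_mean`, and `f_measure` are hypothetical choices made for illustration.

```python
import math

def class_centers(X, y):
    """Per-class mean vectors (assumption: one cluster per class)."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def confidence_weights(X, y, class_cost, eps=1e-6):
    """Per-sample cost = class cost * confidence (assumed formula).

    Confidence shrinks linearly with the distance to the sample's own
    class center, so likely outliers or noisy labels get a lower cost.
    """
    centers = class_centers(X, y)
    dists = [math.dist(xi, centers[yi]) for xi, yi in zip(X, y)]
    radius = {}  # largest distance observed within each class
    for d, yi in zip(dists, y):
        radius[yi] = max(radius.get(yi, 0.0), d)
    return [class_cost[yi] * (1.0 - d / (radius[yi] + eps))
            for d, yi in zip(dists, y)]

def _counts(y_true, y_pred, positive):
    """Confusion-matrix counts with `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the recall on each of the two classes."""
    tp, fn, fp, tn = _counts(y_true, y_pred, positive)
    recall_pos = tp / (tp + fn) if tp + fn else 0.0
    recall_neg = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(recall_pos * recall_neg)

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall on the positive class."""
    tp, fn, fp, _ = _counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Toy data: class 1 (defective) is the minority and gets a higher base cost.
X = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
y = [0, 0, 0, 1]
weights = confidence_weights(X, y, class_cost={0: 1.0, 1: 5.0})
```

Weights like these could then be handed to an SVM trainer that accepts per-sample weights, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`. In a setup like the one the abstract describes, the base costs in `class_cost` would be tuned against G-mean or F-measure rather than fixed by hand.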

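Keywords: software defect prediction; cost-sensitive; support vector machine; imbalanced data classification; parameter selection; genetic algorithm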

Classification code: TP311.5 [Automation and Computer Technology: Computer Software and Theory]

 
