Research on Developer Perceived Code Smell Intensity Prediction Model Based on LightGBM and CFS

Cited by: 2

Authors: YU Tong; GAO Jian-hua [1]

Affiliation: [1] Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2022, No. 12, pp. 2667-2674 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (61672355).

Abstract: Accurately predicting code smell intensity allows high-risk code problems to be handled first, which reduces the maintenance cost of software projects. Research on smell intensity is still scarce, and existing identification methods based on traditional manual inspection or on a single algorithm cannot guarantee detection accuracy and efficiency. To address this, the paper proposes a developer-perceived code smell intensity prediction model that combines LightGBM with CFS. The model takes code metrics filtered by correlation-based feature selection (CFS), accounts for the severity of code smells as perceived by developers, and uses the LightGBM algorithm to predict the smell intensity of instances covering four types of code smell and to assign them intensity levels. The paper also verifies statistically that the considered code metrics are strongly correlated with smell severity. Experiments show that the proposed model outperforms random forest (RF), the previously best-performing model, on precision, recall, F1, MCC, and AUC: F1 reaches up to 90.0%, an improvement of up to 3.7%; AUC reaches up to 94.2%, an improvement of up to 3.8%; and prediction time can be shortened by 76.1% compared with the RF model.
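The feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes Pearson correlation as the correlation measure and a greedy forward search, whereas CFS is often described with other measures and search strategies. The metric names (`wmc`, `cbo`, `nom`, `noise`), the toy values, and the function names are all hypothetical. The sketch scores a feature subset with the standard CFS merit, k·r_cf / sqrt(k + k(k-1)·r_ff), which rewards correlation with the target and penalizes redundancy among the selected features.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(columns, target, subset):
    """CFS merit of a non-empty subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(subset)
    # Mean absolute feature-target correlation.
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    if k == 1:
        return r_cf
    # Mean absolute feature-feature correlation over all pairs in the subset.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, target):
    """Greedy forward search (an assumption): add the feature that most improves
    the merit, and stop as soon as no candidate improves it."""
    selected, best = [], 0.0
    remaining = list(columns)
    while remaining:
        scored = [(cfs_merit(columns, target, selected + [f]), f) for f in remaining]
        merit, feature = max(scored, key=lambda s: s[0])
        if merit <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = merit
    return selected, best


# Hypothetical toy data: eight code instances, four metrics, severity levels 0-3.
metrics = {
    "wmc": [5, 5, 20, 20, 5, 5, 20, 20],
    "cbo": [1, 4, 1, 4, 1, 4, 1, 4],
    "nom": [2, 2, 8, 8, 2, 2, 8, 2],      # largely redundant with wmc
    "noise": [1, 0, 0, 1, 1, 0, 0, 1],    # unrelated to severity
}
severity = [0, 1, 2, 3, 0, 1, 2, 3]

selected, merit = cfs_forward_select(metrics, severity)
print(selected, round(merit, 4))
```

On this toy data the search keeps `wmc` and `cbo` and rejects `nom` as redundant and `noise` as uninformative. In a pipeline like the one the abstract describes, the selected columns would then be passed to a gradient boosting classifier such as `lightgbm.LGBMClassifier`; that step is left out here so the sketch runs with the standard library alone.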

Keywords: code smell; LightGBM; CFS; intensity prediction; machine learning

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
