Authors: HUANG Chenjun (黄晨峻); GAO Jianhua (高建华) [1]
Affiliation: [1] Department of Computer Science and Technology, Shanghai Normal University, Shanghai 200234, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2025, No. 2, pp. 504-512 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (61672355).
Abstract: Code smells gradually degrade software quality and reduce the understandability and maintainability of software. To detect code smells in software structure, this paper proposes a detection method based on CK metrics that combines two-step feature selection with soft-voting ensemble learning. First, feature selection is performed: the Pearson correlation coefficient is used to remove redundant features, and XGBoost feature importance is then used to select the most relevant metrics from those that remain. Second, to address the poor generalization of a single machine learning model, a soft-voting ensemble built on five well-established machine learning models is proposed to carry out the code smell classification task. The experiments are based on CK metrics and use a dataset covering 7 open-source projects and 4 types of code smell. The results show that the method reduces the feature dimension and outperforms the other classification models on the performance metrics, with the F1 score improving by up to 3.24% and AUC by up to 2.32%.
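Keywords: code smell; feature selection; CK metrics; voting model; ensemble learning
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]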
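Note: two of the ideas named in the abstract, Pearson-based removal of redundant features and soft voting, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 0.9 correlation threshold, the toy metric values, and the hand-written model probabilities are all assumptions made for illustration, and the XGBoost feature-importance step and the five actual models are omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def drop_redundant(columns, threshold=0.9):
    """Greedy filter: keep a feature only if its |r| with every feature
    kept so far does not exceed the threshold (threshold is an assumption)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def soft_vote(probabilities):
    """Average the per-class probabilities of several models, then take
    the class with the highest averaged probability."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    avg = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=lambda c: avg[c])
    return label, avg


# Toy values for three CK metrics (made up; rfc is exactly 2 * wmc).
columns = {
    "wmc": [1, 2, 3, 4, 5],
    "rfc": [2, 4, 6, 8, 10],
    "dit": [3, 1, 4, 1, 5],
}
kept = drop_redundant(columns)

# Hypothetical [P(clean), P(smelly)] outputs of three models for one class.
label, avg = soft_vote([[0.55, 0.45], [0.55, 0.45], [0.10, 0.90]])
print(kept, label, avg)
```

In the toy vote, two of the three models lean slightly towards "clean", so a hard majority vote would pick class 0, while averaging the probabilities lets the one confident model tip the result to class 1. That is the difference between soft and hard voting.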