Long Method Detection Based on Cost-sensitive Integrated Classifier (cited by: 3)


Authors: LIU Li-qian (刘丽倩), DONG Dong (董东) [1]

Affiliation: [1] College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024, China

Source: Computer Science (《计算机科学》), 2018, Issue B11, pp. 497-500 (4 pages)

Abstract: Long Method is a software design problem in which a method has grown so long that it needs to be refactored. To improve the detection rate of traditional machine learning approaches on long methods, and to address the class imbalance typical of code smell data, a cost-sensitive integrated (ensemble) classifier algorithm is proposed. Building on the traditional decision tree algorithm, an under-sampling strategy is used to resample the data into multiple balanced subsets. Each subset is used to train a base classifier of the same type, and the base classifiers are then combined into a single integrated classifier. Finally, a misclassification cost determined by cognitive complexity is introduced into the integrated classifier, which biases it toward correctly classifying the minority class. Compared with traditional machine learning algorithms, this method improves both the precision and the recall of long method detection.

Keywords: long method; code smell; cost-sensitive; cognitive complexity

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
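
The pipeline described in the abstract (under-sample into balanced subsets, train one base classifier per subset, combine them, then apply a misclassification cost) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a depth-1 decision tree (a stump) stands in for the decision tree learner, that label 1 marks the minority class (long methods), and that the cost of missing a long method grows linearly with cognitive complexity. The abstract does not give the cost formula, so `cost_fn`, `scale`, and `cost_fp` are hypothetical.

```python
import random


def train_stump(X, y):
    """Fit a depth-1 decision tree: pick the (feature, threshold) pair with
    the fewest training errors for the rule 'predict 1 when value >= threshold'."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            errors = sum((row[f] >= t) != bool(label) for row, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return f, t


def balanced_subsets(X, y, n_subsets, rng):
    """Yield subsets that keep every minority sample (label 1) and an equally
    sized random draw, without replacement, from the majority class (label 0)."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    for _ in range(n_subsets):
        idx = pos + rng.sample(neg, len(pos))
        yield [X[i] for i in idx], [y[i] for i in idx]


def train_ensemble(X, y, n_subsets=11, seed=0):
    """Train one stump per balanced subset and return the list of stumps."""
    rng = random.Random(seed)
    return [train_stump(Xs, ys) for Xs, ys in balanced_subsets(X, y, n_subsets, rng)]


def vote_fraction(ensemble, row):
    """Fraction of base classifiers that vote 'long method' for this row."""
    return sum(row[f] >= t for f, t in ensemble) / len(ensemble)


def predict(ensemble, row, cognitive_complexity, scale=10.0, cost_fp=1.0):
    """Minimum-expected-cost decision over the ensemble vote.

    Hypothetical cost model (an assumption, not taken from the paper):
    missing a long method costs 1 + cognitive_complexity / scale, while a
    false alarm costs cost_fp.
    """
    p = vote_fraction(ensemble, row)
    cost_fn = 1.0 + cognitive_complexity / scale
    return int(p * cost_fn > (1.0 - p) * cost_fp)
```

With equal costs, `predict` reduces to a plain majority vote. Raising the cost of a missed long method lowers the vote fraction needed to flag a method, which is one way to tilt the ensemble toward the minority class.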

 
