Authors: LIU Li-qian; DONG Dong [1]
Affiliation: [1] College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024, China
Source: Computer Science (《计算机科学》), 2018, Issue B11, pp. 497-500 (4 pages)
Abstract: Long Method is a software design problem in which a method has grown too long and needs refactoring. To improve the detection rate of traditional machine learning approaches on long methods, a cost-sensitive ensemble classifier algorithm is proposed that addresses the class imbalance of code smell data. Building on the traditional decision tree algorithm, an under-sampling strategy resamples the data to generate multiple balanced subsets. Each subset trains a base classifier of the same type, and the base classifiers are then combined into an ensemble classifier. Finally, a misclassification cost determined by cognitive complexity is introduced into the ensemble classifier, which biases it toward correctly classifying the minority class. Compared with traditional machine learning algorithms, the method improves both precision and recall in long method detection.
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
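
The pipeline outlined in the abstract (under-sample into balanced subsets, train one base classifier per subset, combine them, then apply a misclassification cost) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. A depth-1 decision stump stands in for the decision tree base learner. The function names, the feature layout, and the `miss_cost` formula are all hypothetical, since the abstract does not say how the cost is derived from cognitive complexity.

```python
import random

def train_stump(rows):
    """Fit a depth-1 decision tree (a stand-in for the paper's decision tree).
    rows: list of (features, label); label 1 = long method, 0 = not."""
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            # Rule under test: predict 1 when feature f is >= threshold t.
            errors = sum((x[f] >= t) != bool(y) for x, y in rows)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda x: 1 if x[f] >= t else 0

def balanced_subsets(rows, n_subsets, rng):
    """Under-sample the majority class so each subset is balanced.
    Assumes label 1 is the minority class."""
    minority = [r for r in rows if r[1] == 1]
    majority = [r for r in rows if r[1] == 0]
    return [minority + rng.sample(majority, len(minority))
            for _ in range(n_subsets)]

def train_ensemble(rows, n_subsets=5, seed=0):
    """Train one base classifier of the same type per balanced subset."""
    rng = random.Random(seed)
    return [train_stump(s) for s in balanced_subsets(rows, n_subsets, rng)]

def miss_cost(cognitive_complexity, scale=10.0):
    """Hypothetical cost of missing a long method: grows with complexity."""
    return 1.0 + cognitive_complexity / scale

def predict(ensemble, x, cost_fn=1.0, cost_fp=1.0):
    """Cost-sensitive vote: pick the label with the lower expected cost."""
    p = sum(clf(x) for clf in ensemble) / len(ensemble)  # vote share for label 1
    # Predicting 0 risks p * cost_fn; predicting 1 risks (1 - p) * cost_fp.
    return 1 if p * cost_fn >= (1 - p) * cost_fp else 0
```

With features such as `(lines_of_code, cognitive_complexity)`, a call like `predict(ensemble, x, cost_fn=miss_cost(x[1]))` makes a missed long method more expensive as complexity rises, so a minority of "long" votes can still decide the outcome.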