Empirically revisiting and enhancing automatic classification of bug and non-bug issues  


Authors: Zhong LI, Minxue PAN, Yu PEI, Tian ZHANG, Linzhang WANG, Xuandong LI

Affiliations: [1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; [2] Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China; [3] Software Institute, Nanjing University, Nanjing 210093, China; [4] Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China

Source: Frontiers of Computer Science (计算机科学前沿, English edition), 2024, Issue 5, pp. 25-44 (20 pages)

Funding: This research was supported by the National Natural Science Foundation of China (Grant No. 61972193) and the Program B for Outstanding PhD Candidate of Nanjing University.

Abstract: A large body of research effort has been dedicated to automated issue classification for Issue Tracking Systems (ITSs). Although the existing approaches have shown promising performance, their different design choices, including the textual fields, feature representation methods, and machine learning algorithms they adopt, have not been comprehensively compared and analyzed. To fill this gap, we perform the first extensive study of automated issue classification on 9 state-of-the-art issue classification approaches. Our experimental results on a widely studied dataset reveal multiple practical guidelines for automated issue classification, including: (1) Training separate models for the issue titles and descriptions and then combining these two models tends to achieve better performance for issue classification; (2) Word embedding with Long Short-Term Memory (LSTM) can better extract features from the textual fields in the issues and hence leads to better issue classification models; (3) There exist certain terms in the textual fields that are helpful for building more discriminating classifiers between bug and non-bug issues; (4) The performance of the issue classification model is not sensitive to the choice of ML algorithm. Based on our study outcomes, we further propose an advanced issue classification approach, DEEPLABEL, which achieves better performance than the existing issue classification approaches.
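Guideline (1) in the abstract, training one model on titles and another on descriptions and then combining them, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the abstract does not say how the two models are combined, so the sketch simply averages their class probabilities, and it swaps in a tiny Naive Bayes text classifier rather than the word-embedding and LSTM models the paper studies. The class `TinyNaiveBayes`, the function `combined_proba`, the `title_weight` parameter, and the toy issues are all hypothetical and are not taken from DEEPLABEL.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a real pipeline would do more.
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for c in self.classes:
            self.vocab.update(self.word_counts[c])
        return self

    def predict_proba(self, text):
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log(
                        (self.word_counts[c][word] + 1)
                        / (total_words + len(self.vocab))
                    )
            log_scores[c] = score
        # Normalize log scores into probabilities in a numerically stable way.
        top = max(log_scores.values())
        exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {c: v / norm for c, v in exp_scores.items()}


def combined_proba(title_model, desc_model, title, description, title_weight=0.5):
    # Assumed combination rule: weighted average of the two models' probabilities.
    p_title = title_model.predict_proba(title)
    p_desc = desc_model.predict_proba(description)
    return {
        c: title_weight * p_title[c] + (1 - title_weight) * p_desc[c]
        for c in p_title
    }


# Hypothetical toy issues: (title, description, label).
issues = [
    ("App crashes on startup", "Null pointer exception thrown when launching the app", "bug"),
    ("Add dark mode", "It would be nice to support a dark color theme", "non-bug"),
    ("Wrong total in report", "The sum is incorrect and an error is logged", "bug"),
    ("Update documentation", "Please improve the install guide and add examples", "non-bug"),
]
titles = [t for t, _, _ in issues]
descriptions = [d for _, d, _ in issues]
labels = [y for _, _, y in issues]

# One model per textual field, as the guideline suggests.
title_model = TinyNaiveBayes().fit(titles, labels)
desc_model = TinyNaiveBayes().fit(descriptions, labels)

probs = combined_proba(
    title_model,
    desc_model,
    "Crashes when saving",
    "An exception is thrown and the error is logged",
)
predicted = max(probs, key=probs.get)
```

Averaging probabilities is only one possible way to combine the two models; stacking a second classifier on top of the two outputs would be another, and the abstract does not indicate which, if either, the authors use.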

Keywords: issue tracking; issue type prediction; empirical study

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
