Detecting Duplicate Contributions in Pull-Based Model Combining Textual and Change Similarities


Authors: Zhi-Xing Li, Yue Yu, Tao Wang, Gang Yin, Xin-Jun Mao, Huai-Min Wang

Affiliations: [1] Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, Changsha 410073, China; [2] Laboratory of Software Engineering for Complex Systems, College of Computer, National University of Defense Technology, Changsha 410073, China

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2021, No. 1, pp. 191-206 (16 pages)

Funding: This work was supported by the National Key Research and Development Program of China under Grant No. 2018YFB1004202 and the National Natural Science Foundation of China under Grant No. 61702534.

Abstract: Communication and coordination between OSS developers who do not work in the same physical location have always been challenging. The pull-based development model, as the state-of-the-art collaborative development mechanism, provides high openness and transparency to improve the visibility of contributors' work. However, because of the parallel and uncoordinated nature of this model, more than one contributor may still submit duplicate contributions that solve the same problem. If not detected in time, duplicate pull-requests can cause contributors and reviewers to waste time and energy on redundant work. In this paper, we propose an approach that combines textual and change similarities to automatically detect duplicate contributions in the pull-based model at submission time. For a newly arriving contribution, we first compute the textual similarity and the change similarity between it and each existing contribution. Our method then returns a list of candidate duplicate contributions that are most similar to the new contribution in terms of the combined textual and change similarity. The evaluation shows that, on average, 83.4% of the duplicates can be found when we use the combined textual and change similarity, compared with 54.8% using only textual similarity and 78.2% using only change similarity.
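The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measures or which combination rule the paper uses, so everything concrete here is an assumption made for illustration: term-frequency cosine similarity stands in for textual similarity, Jaccard overlap of changed file paths stands in for change similarity, and the two are combined with a hypothetical `text_weight` parameter. The pull-request records and the function names are likewise invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Jaccard overlap between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(new, old, text_weight=0.5):
    """Weighted sum of textual and change similarity.

    Assumption: a linear combination; the paper may combine them differently.
    """
    text_sim = cosine(Counter(tokenize(new["text"])),
                      Counter(tokenize(old["text"])))
    change_sim = jaccard(set(new["files"]), set(old["files"]))
    return text_weight * text_sim + (1 - text_weight) * change_sim


def candidate_duplicates(new, existing, k=3):
    """Return the ids of the k existing contributions most similar to `new`."""
    scored = [(combined_similarity(new, pr), pr["id"]) for pr in existing]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pr_id for _, pr_id in scored[:k]]


# Hypothetical pull-requests, invented for this example.
existing = [
    {"id": 101, "text": "Fix null pointer in config parser",
     "files": ["src/config/parser.py"]},
    {"id": 102, "text": "Add dark mode to settings page",
     "files": ["ui/settings.css", "ui/settings.js"]},
    {"id": 103, "text": "Update README badges",
     "files": ["README.md"]},
]
new = {"id": 104, "text": "Config parser crashes with null pointer",
       "files": ["src/config/parser.py", "tests/test_parser.py"]}

print(candidate_duplicates(new, existing, k=1))  # [101]
```

In this toy data, pull-request 101 shares both vocabulary and a changed file with the new contribution, so it ranks first. Using both signals lets a candidate surface even when only one of the two similarities is high.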

Keywords: pull-request; duplicate detection; textual similarity; change similarity

Classification: TP39 [Automation and Computer Technology: Computer Application Technology]

 
