A link establishment method between pull requests and issues in open source community

Original title: 开源社区拉取请求与问题的链接建立方法

Authors: Jing JIANG (蒋竞); Chenhong JI (季陈虹); Meng MIAO (苗萌); Li ZHANG (张莉) [1]

Affiliation: [1] State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing 100191, China

Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2025, Issue 3, pp. 559-581 (23 pages)

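Funding: Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2021ZD0112901), the National Natural Science Foundation of China (Grant No. 62177003), and the Fundamental Research Funds for the Central Universities (Grant No. JKF-20240213).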

Abstract: In the GitHub open-source community, developers contribute code to projects by submitting pull requests (PRs). Some PRs are linked to issues reported by users, indicating that the PR is intended to resolve those issues. Maintaining these links enhances project traceability and accountability. However, links are currently established manually by users; because of the high volume of PRs and issues, manual linking is time-consuming, labor-intensive, and prone to omissions. To address these problems, this study proposes LinkFinder, a method for establishing links between PRs and issues. LinkFinder applies template filtering to reduce the impact of highly homogeneous template content in description bodies, extracts semantic and statistical features from PRs and issues, and uses a deep neural network to calculate the matching degree between PRs and issues, producing a recommended list of linked issues for each PR. Experiments were conducted on 25411 links collected from five open-source projects. The results show that LinkFinder achieves a mean average precision (MAP) of 0.434 to 0.774, a mean reciprocal rank (MRR) of 0.436 to 0.774, a Top-1 precision of 0.344 to 0.702, a Top-1 recall of 0.333 to 0.698, and a Top-1 F1 score of 0.338 to 0.700. Compared to the baseline method T-BERT, LinkFinder improves MAP by 9.01% to 186.63%, MRR by 8.86% to 183.50%, Top-1 precision by 20.21% to 388.54%, Top-1 recall by 20.34% to 386.52%, and Top-1 F1 score by 20.27% to 389.13%. To analyze the value of the links, this study further designs a reviewer recommendation method based on participation in linked issues. Experimental results show that, compared to the baseline method RevFinder, ranking candidate reviewers with their participation in linked issues improves MAP by 3.11% to 41.20% and MRR by 2.45% to 49.26% on four projects.
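The MAP and MRR figures reported in the abstract are ranking metrics over recommended issue lists. The sketch below illustrates the standard textbook definitions of both metrics; it is not taken from the paper, whose exact evaluation protocol may differ. The function names and the sample PR and issue identifiers are hypothetical and chosen only for illustration.

```python
from typing import Callable, Hashable, List, Sequence, Set, Tuple

Query = Tuple[Sequence[Hashable], Set[Hashable]]


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1 / rank of the first relevant item, or 0.0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Average the precision values at each rank where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean_metric(metric: Callable[[Sequence[Hashable], Set[Hashable]], float],
                queries: List[Query]) -> float:
    """Average a per-query metric over all queries (one query per PR)."""
    scores = [metric(ranked, relevant) for ranked, relevant in queries]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical data: each entry is (ranked issue IDs recommended for one PR,
# set of issue IDs actually linked to that PR).
queries: List[Query] = [
    ([101, 57, 33], {57}),     # first relevant issue at rank 2
    ([12, 8, 40], {12, 40}),   # relevant issues at ranks 1 and 3
]

mrr = mean_metric(reciprocal_rank, queries)
map_score = mean_metric(average_precision, queries)
print(f"MRR = {mrr:.3f}, MAP = {map_score:.3f}")
```

When every PR has exactly one linked issue, average precision equals the reciprocal rank, which is consistent with the MAP and MRR ranges in the abstract being nearly identical.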

Keywords: GitHub; pull request; issue; link establishment; reviewer recommendation

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
