Description Generation Method for Pull Requests Based on Retrieval of Modification Tree (基于改动树检索的拉取请求描述生成方法)  Cited by: 1


Authors: JIANG Jing (蒋竞); LIU Zi-Hao (刘子豪); ZHANG Li (张莉) [1]; WANG Liang (汪亮)

Affiliations: [1] State Key Laboratory of Complex Critical Software Environment (Beihang University), Beijing 100191, China; [2] State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, Jiangsu, China

Source: Journal of Software (软件学报), 2024, Issue 11, pp. 5065-5082 (18 pages)

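Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2021ZD0112901); National Natural Science Foundation of China (62177003, 62172203).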

Abstract: As open-source artificial intelligence (AI) systems grow in scale, software development and maintenance become more difficult. GitHub is one of the most important hosting platforms for open-source projects, and its pull request system lets developers participate easily in open-source development. A pull request's description helps the project's core team understand the content of the pull request and the developer's intent, and it promotes acceptance of the pull request. At present, a considerable proportion of developers provide no description for their pull requests, which increases the core team's workload and hinders later maintenance of the project. This study proposes PRSim, a method that automatically generates descriptions for pull requests. The method extracts features from a pull request, including commit messages, comment updates, and code changes; builds a syntax modification tree; and encodes it with a tree-structured autoencoder to retrieve other pull requests with similar code changes. Then, using the description of a similar pull request as a reference, it summarizes the commit messages and comment updates with an encoder-decoder network to generate the description of the new pull request. Experimental results show that PRSim reaches F1 scores of 36.47%, 27.69%, and 35.37% on Rouge-1, Rouge-2, and Rouge-L, respectively. These scores are 34.3%, 75.2%, and 55.3% higher than those of LeadCM; 16.2%, 22.9%, and 16.8% higher than those of Attn+PG+RL; and 23.5%, 72.0%, and 24.8% higher than those of PRHAN.
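For readers unfamiliar with the reported metrics, the following is a minimal sketch of how Rouge-N and Rouge-L F1 scores can be computed between a generated description and a reference description. It assumes whitespace tokenization with no stemming or stop-word handling, which is a simplification; the abstract does not state which Rouge implementation or preprocessing the authors used, and the function names here are illustrative only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n):
    """Rouge-N F1 based on clipped n-gram overlap (assumed tokenization: split on whitespace)."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Rouge-L F1 based on the longest common subsequence of tokens."""
    cand_tokens, ref_tokens = candidate.split(), reference.split()
    return f1(lcs_length(cand_tokens, ref_tokens), len(cand_tokens), len(ref_tokens))


if __name__ == "__main__":
    generated = "fix null pointer in parser"
    reference = "fix null pointer bug in the parser"
    print(round(rouge_n_f1(generated, reference, 1), 4))
    print(round(rouge_n_f1(generated, reference, 2), 4))
    print(round(rouge_l_f1(generated, reference), 4))
```

Note that the improvement figures in the abstract are relative gains over each baseline's score, not differences in percentage points.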

Keywords: pull request; syntax modification tree; similarity calculation; automatic summarization; open-source community

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
