Authors: JIANG Jing (蒋竞) [1]; LIU Zi-Hao (刘子豪) [1]; ZHANG Li (张莉) [1]; WANG Liang (汪亮) [2]
Affiliations: [1] State Key Laboratory of Complex Critical Software Environment (Beihang University), Beijing 100191, China; [2] State Key Laboratory for Novel Software Technology (Nanjing University), Nanjing 210023, Jiangsu, China
Source: Journal of Software (《软件学报》), 2024, No. 11, pp. 5065-5082 (18 pages)
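Funding: Science and Technology Innovation 2030 - "New Generation Artificial Intelligence" Major Project (2021ZD0112901); National Natural Science Foundation of China (62177003, 62172203).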
Abstract: As open-source artificial intelligence (AI) systems grow in scale, software development and maintenance become more difficult. GitHub is one of the most important hosting platforms for open-source projects, and its pull request system lets developers easily contribute to them. A pull request's description helps the project's core team understand the content of the pull request and the developer's intent, which promotes its acceptance. At present, a considerable proportion of developers provide no description for their pull requests, which increases the core team's workload and hinders future maintenance of the project. This study proposes PRSim, a method that automatically generates descriptions for pull requests. The method extracts features including commit messages, comment updates, and code changes from a pull request, builds a syntax modification tree, and encodes it with a tree-structured autoencoder to retrieve other pull requests with similar code changes. Then, with reference to the description of a similar pull request, it summarizes the commit messages and comment updates through an encoder-decoder network to generate the description of the new pull request. Experimental results show that PRSim achieves F1 scores of 36.47%, 27.69%, and 35.37% on Rouge-1, Rouge-2, and Rouge-L, respectively, which are 34.3%, 75.2%, and 55.3% higher than LeadCM; 16.2%, 22.9%, and 16.8% higher than Attn+PG+RL; and 23.5%, 72.0%, and 24.8% higher than PRHAN.
Keywords: pull request; syntax modification tree; similarity calculation; automatic summarization; open-source community
Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
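Note on the metrics: the abstract reports F1 scores on Rouge-1 and Rouge-2, which measure unigram and bigram overlap between a generated description and a reference description. The following is a minimal sketch of how a ROUGE-N F1 score can be computed, not the evaluation tooling used in the paper (this record does not state it). The function names, the lowercase whitespace tokenization, and the two example strings are assumptions made for illustration only. Rouge-L, which is based on the longest common subsequence, is not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1 between two strings.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; real evaluations often apply stemming or other rules.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated and reference descriptions (not from the paper).
generated = "fix null pointer in parser"
reference = "fix null pointer error in the parser"

print(round(rouge_n_f1(generated, reference, n=1), 4))
print(round(rouge_n_f1(generated, reference, n=2), 4))
```

For these toy strings, all five generated unigrams appear in the seven-token reference (precision 1, recall 5/7), and two of the four generated bigrams match the six reference bigrams (precision 1/2, recall 1/3), so the script prints 0.8333 and 0.4.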