Authors: CHEN Ping (陈萍)[1]; ZHOU Liliang (周礼亮)[2]; ZHANG Weifeng (张卫丰)[3]
Affiliations: [1] Library, Nanjing Medical University, Nanjing 211166, Jiangsu, China; [2] China Electronics Technology Group Corporation Tenth Research Institute, Chengdu 610000, Sichuan, China; [3] School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China
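Source: Software Guide (《软件导刊》), 2025, No. 4, pp. 89-92 (4 pages)
Funding: National Natural Science Foundation of China, General Program (62272214); Nanjing International Cooperation Project (202201010, 202401006).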
Abstract: A Pull Request is an important way of contributing code on GitHub. Developers submit a Pull Request when they want to merge code changes from their local machine into the main repository that holds all of a project's source code. This paper reports a prediction experiment on Pull Request rejection based on a logistic regression model. The input features are the factors affecting rejection that were identified through association rule mining. They mainly comprise code characteristics of the submitted changes, text characteristics of the Pull Request description, contributor characteristics derived from the developer's previous behavior, and interactions during the Pull Request process. The approach was evaluated on 140,155 Pull Requests from 12 open-source projects. The logistic regression model achieved an accuracy of 0.84, a recall of 0.99, and an F1 score of 0.91, each somewhat higher than the baseline method. Analysis of the prediction results indicates that the factors identified through association rule mining have sufficient influence on whether a Pull Request is merged. They can help developers concentrate on the main factors or allocate more resources to key problems, which helps avoid rejected Pull Requests and reduces project cost and time.
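The three reported scores can be checked against each other with the standard F1 definition (the harmonic mean of precision and recall). The sketch below is illustrative only: the function name `f1_score` is hypothetical, and reading the reported 0.84 as precision is an assumption, since the abstract labels it accuracy. Under that assumption the figures are mutually consistent.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract. Treating 0.84 as precision is an
# assumption; the abstract itself calls this figure accuracy.
reported_precision = 0.84
reported_recall = 0.99

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.91, matching the reported F1 score
```

A recall of 0.99 combined with a lower 0.84 suggests that the model rarely misses the positive class but flags some extra cases; which class counts as positive is not stated in the abstract.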
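Keywords: Pull Request; influencing factors; data mining; regression model; GitHub
Classification: TP311 [Automation and Computer Technology - Computer Software and Theory]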