Code Comment Generation Based on Concept Propagation for Software Projects (original title: 基于概念传播的软件项目代码注释生成方法)  Cited by: 1

Authors: PAN Xing-Lu (潘兴禄); LIU Chen-Xiao (刘陈晓); WANG Min (王敏); ZOU Yan-Zhen (邹艳珍) [1,2]; WANG Tao (王涛); XIE Bing (谢冰) [1,2]

Affiliations: [1] Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China [2] School of Computer Science, Peking University, Beijing 100871, China [3] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China

Source: Journal of Software (《软件学报》), 2023, No. 9, pp. 4114-4131 (18 pages)

Funding: National Natural Science Foundation of China (61972006).

Abstract: Comment generation for software code has been an important research problem in software engineering in recent years. Many research efforts have achieved good results on open-source datasets containing large numbers of <code snippet, comment> pairs. In enterprise practice, however, the code to be commented is usually a whole software project repository: one must first decide on which code lines comments are worth generating, and the code snippets to be commented vary in size and granularity. A noise-resistant code comment generation method that integrates commenting decisions with comment generation is therefore needed. To this end, this study proposes CoComment, a software-project-oriented approach to automatic code comment generation. The approach automatically extracts basic domain concepts from software project documents, then propagates and expands these concepts through code parsing and text matching. On this basis, it makes commenting decisions automatically by locating the code lines or segments related to these concepts, and finally uses templates to fuse concepts with context into highly readable natural-language comments. CoComment has been evaluated in comparative experiments on three enterprise software projects containing more than 46,000 manually written code comments. The results show that the approach makes commenting decisions effectively and that, compared with existing methods, its comments provide more information useful for understanding the code, thereby offering an integrated solution to commenting decisions and comment generation for software project code.
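
The pipeline outlined in the abstract (match concepts in code, propagate them, decide which lines to comment, fill a template) can be illustrated with a toy sketch. This is not CoComment's implementation: the glossary contents, the function names (`words`, `propagate`, `generate_comments`), the assignment-based propagation rule, and the comment template are all assumptions chosen for illustration only.

```python
import re

def words(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", identifier)]

def propagate(lines, concepts):
    """Return {line index: set of concept terms} for concept-related lines.

    Assumed propagation rule (hypothetical): a variable assigned on a
    concept-related line inherits that line's concepts, so later lines
    that use the variable are also treated as concept-related.
    """
    carried = {}  # variable name -> concepts inherited through assignment
    tagged = {}
    for i, line in enumerate(lines):
        found = set()
        for ident in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", line):
            # Text matching: identifier words that appear in the glossary.
            found |= {w for w in words(ident) if w in concepts}
            # Propagation: concepts carried by previously assigned variables.
            found |= carried.get(ident, set())
        target = re.match(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*=[^=]", line)
        if target and found:
            carried[target.group(1)] = set(found)
        if found:
            tagged[i] = found
    return tagged

def generate_comments(lines, concepts):
    """Commenting decision plus template filling for each tagged line."""
    comments = {}
    for i, terms in sorted(propagate(lines, concepts).items()):
        description = "; ".join(concepts[t] for t in sorted(terms))
        comments[i] = f"# Related to: {description}"
    return comments
```

For example, with the hypothetical glossary `{"order": "customer purchase order"}` and the lines `rec = load_order(uid)`, `total = rec.sum()`, `print(uid)`, the first line is tagged by direct text matching, the second only through propagation via `rec`, and the third receives no comment.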

Keywords: code comment; software project; commenting decision; comment generation; concept propagation

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
