Research Progress of Code Naturalness and Its Application

Authors: CHEN Zhe-Zhe; YAN Meng; XIA Xin[3]; LIU Zhong-Xin; XU Zhou; LEI Yan

Affiliations: [1] Key Laboratory of Dependable Service Computing in Cyber Physical Society (Chongqing University), Ministry of Education, Chongqing 400044, China [2] School of Big Data and Software Engineering, Chongqing University, Chongqing 401331, China [3] Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia [4] College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China

Source: Journal of Software (《软件学报》), 2022, No. 8, pp. 3015-3034 (20 pages)

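Funding: National Natural Science Foundation of China (62002034); Fundamental Research Funds for the Central Universities (2020CDCGRJ072, 2020CDJQYA021, 2021CDJKYJH032); National Defense Basic Scientific Research Program (WDZC20205500308); China Postdoctoral Science Foundation (2020M673137); Natural Science Foundation of Chongqing (cstc2020jcyj-bshX0114).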

Abstract: Code naturalness is a research hotspot shared by natural language processing and software engineering. It aims to solve software engineering tasks by building code naturalness models based on natural language processing techniques. In recent years, as the volume of source code and data in open-source software communities has continued to grow, more and more researchers have focused on the information contained in source code and have achieved a series of research results. At the same time, code naturalness research faces many challenges in code corpus construction, model construction, and task application. In view of this, this paper reviews and summarizes recent progress in code naturalness research and its applications from the perspectives of code corpus construction, model construction, and task application. The main contents are as follows: (1) introducing the basic concept of code naturalness and an overview of its research; (2) summarizing the corpora currently used in code naturalness research, and classifying and summarizing the modeling methods for code naturalness; (3) summarizing the experimental validation methods and evaluation metrics for code naturalness models; (4) summarizing and categorizing the current applications of code naturalness; (5) summarizing the key issues of code naturalness techniques; (6) discussing the future development of code naturalness techniques.

Keywords: code naturalness; mining software repositories; code language model

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
