Authors: ZHANG Feng (张峰); WEI Youliang (韦友良); QIN Yucheng (秦玉成)
Affiliation: [1] College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, Shandong, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2025, No. 1, pp. 249-256 (8 pages)
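Funding: Supported by the Humanities and Social Sciences Research Planning Fund of the Ministry of Education (23YJAZH192); the National Natural Science Foundation of China (52374221); the Natural Science Foundation of Shandong Province (ZR2021QG038); the Taishan Scholar Distinguished Expert Support Program of Shandong Province (ts20190936); and the Top Young Teaching Talent Training Program of Shandong University of Science and Technology (BJ20200505).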
Abstract: Cross-language code plagiarism detection is widely used in areas such as software intellectual property protection and the teaching of computer programming courses. However, syntactic differences between programming languages reduce the similarity between code fragments, which lowers the accuracy of plagiarism detection. This paper therefore proposes a cross-language code plagiarism detection method based on program flowcharts and a graph attention network. First, the source code is converted into a program flowchart, and a graph attention network extracts the features of the flowchart as the representation of the code. Second, a cross-matching method compares the code representations line by line to obtain similarity feature vectors. Finally, the similarity feature vectors of the code under inspection are concatenated, and a fully connected neural network computes the probability of plagiarism. Experimental results show that, compared with existing cross-language code plagiarism detection methods, the proposed method improves precision, recall, and F1 score. In particular, compared with the attribute-counting-based CLCDSA method and the abstract-syntax-tree-based ASTLearner method, the F1 score increases by 11% and 16%, respectively.
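Keywords: code plagiarism detection; cross-programming-language; program flowchart; graph attention network
Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]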
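
The three steps named in the abstract (flowchart features from a graph attention layer, cross-matching of the resulting representations, and a final scoring layer) can be illustrated with a toy sketch. This is not the authors' implementation: the record gives no architectural details, so everything below is an assumption made for illustration. The one-hot node types, the hand-written flowcharts, the fixed untrained weights `W`, `a`, `weights`, and `bias`, the use of cosine similarity with best-match pooling as the "cross-matching" step, and the single logistic unit standing in for the fully connected network are all hypothetical.

```python
import math

def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def gat_layer(features, neighbours, W, a):
    """One single-head graph attention layer with fixed (untrained) weights."""
    z = [matvec(W, h) for h in features]
    out = []
    for i, nbrs in enumerate(neighbours):
        js = [i] + [j for j in nbrs if j != i]  # attend over self and neighbours
        scores = [leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
                  for j in js]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
        total = sum(exps)
        alphas = [e / total for e in exps]
        out.append([sum(al * z[j][k] for al, j in zip(alphas, js))
                    for k in range(len(z[i]))])
    return out

def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    if nu == 0 or nv == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (nu * nv)

def cross_match(rep_a, rep_b):
    """Assumed cross-matching: best cosine match in B for every node of A."""
    return [max(cosine(u, v) for v in rep_b) for u in rep_a]

def summarize(sims):
    return [sum(sims) / len(sims), min(sims), max(sims)]

def plagiarism_probability(rep_a, rep_b, weights, bias):
    # Concatenate the similarity features of both directions, then score
    # them with a single logistic unit (placeholder for a trained network).
    feats = summarize(cross_match(rep_a, rep_b)) + summarize(cross_match(rep_b, rep_a))
    logit = bias + sum(w * f for w, f in zip(weights, feats))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical one-hot node types: terminal, process, decision, input/output.
TERMINAL, PROCESS, DECISION, IO = ([1, 0, 0, 0], [0, 1, 0, 0],
                                   [0, 0, 1, 0], [0, 0, 0, 1])

# Two programs with the same loop-shaped flowchart (e.g. a Java original and
# a Python translation) and one unrelated straight-line program.
loop_nodes = [TERMINAL, IO, DECISION, PROCESS, TERMINAL]
loop_edges = [[1], [0, 2], [1, 3, 4], [2], [2]]
line_nodes = [TERMINAL, PROCESS, PROCESS, TERMINAL]
line_edges = [[1], [0, 2], [1, 3], [2]]

W = [[1, 0.5, 0, 0], [0, 1, 0.5, 0], [0, 0, 1, 0.5], [0.5, 0, 0, 1]]
a = [0.3, -0.2, 0.5, 0.1, 0.4, 0.2, -0.1, 0.3]
weights, bias = [1.0] * 6, -5.0

rep_java = gat_layer(loop_nodes, loop_edges, W, a)
rep_python = gat_layer(loop_nodes, loop_edges, W, a)
rep_other = gat_layer(line_nodes, line_edges, W, a)

p_same = plagiarism_probability(rep_java, rep_python, weights, bias)
p_diff = plagiarism_probability(rep_java, rep_other, weights, bias)
print(f"same flowchart: {p_same:.3f}, different flowchart: {p_diff:.3f}")
```

Because both "translations" map to the same flowchart, their representations match exactly and the score is higher than for the structurally different program. That is the intuition behind comparing flowcharts rather than language-specific syntax; in a real system the weights would be learned from labelled pairs.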