基于行为motifs的多线程程序抄袭检测方法  

Plagiarism Detection of Multi-Threaded Programs by Mining Behavioral motifs

在线阅读下载全文

作  者:田振洲 王宁宁 王清 高聪 刘烃[3] 郑庆华[3] Tian Zhenzhou;Wang Ningning;Wang Qing;Gao Cong;Liu Ting;and Zheng Qinghua(School of Computer Science and Technology,Xi'an University of Posts and Telecommunications,Xi'an 710121;Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing(Xi'an University of Posts and Telecommunications),Xi'an 710121;School of Electronic and Information Engineering,Xi'an Jiaotong University,Xi'an 710049)

机构地区:[1]西安邮电大学计算机学院,西安710121 [2]陕西省网络数据分析与智能处理重点实验室(西安邮电大学),西安710121 [3]西安交通大学电信学院,西安710049

出  处:《计算机研究与发展》2020年第1期202-213,共12页Journal of Computer Research and Development

基  金:国家自然科学基金项目(61702414);陕西省自然科学基础研究计划项目(2018JQ6078);陕西省国际科技合作计划项目(2018KW-049,2019KW-008);陕西省重点研发计划项目(2019ZDLGY07-08)~~

摘  要:软件动态胎记技术是实现混淆对抗的软件抄袭检测的有效手段之一.然而,多线程程序中线程交织的不确定性对其造成了不可忽视的影响;极端情况下,传统动态胎记技术甚至会判定同一个程序与其自身不存在抄袭关系.对此,提出从多线程程序在同一输入下的多条执行轨迹中进行相似部分的识别,并从中抽象出不易受线程交织影响的行为motifs来实现多线程程序的抄袭检测.该方法捕捉程序的动态执行轨迹,经过轨迹修剪、gram匹配以及扩展和抽象,从中提取motifs胎记建模多线程程序的行为;最终,通过衡量motifs胎记的相似性实现程序间潜在抄袭的判定.在一个包含234个不同版本多线程程序的公共数据集上开展的实验表明,motifs胎记是一种可靠的线程感知胎记方法,不仅可有效对抗当下主流的代码混淆技术,相比现有的2种多线程程序抄袭检测方法TreSB(thread-related system call birthmark)和TOB(thread-oblivious birthmark),也体现出更优秀的检测能力.Thread scheduling nondeterminism in multi-threaded programs severely perturbs dynamic birthmarking techniques,which serves as one of the most promising approaches in solving obfuscation-resilient software plagiarism detection problem.In the extreme cases,a traditional dynamic birthmarking method even fails to detect plagiarism between a multi-threaded program and itself.To solve the problem,a novel method for plagiarism detection of multi-threaded programs is proposed by abstracting behavioral motifs,which are less susceptible to thread scheduling,from the similar parts of same-input-driven execution traces of a multi-threaded program.Specifically,the method captures execution traces during program runs,from which a series of motifs are inferred by performing trace pruning,gram-based matching,as well as bidirectional extending and abstraction,to depict the behavior of the multi-threaded program.Plagiarism between two programs is then determined by measuring the similarity of their motifs and a threshold.An empirical study conducted on a public benchmark consisting of totally 234 different versions of multi-threaded programs shows that the proposed method is resilient against most state-of-the-art code obfuscation techniques and outperforms two existing dynamic birthmarking methods TreSB(thread-related system call birthmark)and TOB(thread-oblivious birthmark)framework which also target multi-threaded program plagiarism detection.It indicates that the mined behavioral motifs form a new kind of robust thread-oblivious birthmark.

关 键 词:软件抄袭 多线程程序抄袭检测 动态胎记 线程感知胎记 行为motifs 

分 类 号:TP311[自动化与计算机技术—计算机软件与理论]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象