An Android Malicious Code Detection Method Based on Cooperative Training  Cited by: 2


Authors: WANG Quan-min; ZHANG Shuai-shuai; YANG Jing (Department of Informatics, Beijing University of Technology, Beijing 100124, China)

Affiliation: [1] Department of Informatics, Beijing University of Technology, Beijing 100124, China

Source: Computer Technology and Development, 2019, No. 1, pp. 135-139 (5 pages)

Funding: National Natural Science Foundation of China (61272500)

Abstract: This paper studies the application of machine learning to the detection of unknown malware, as an alternative to traditional malware detection methods. A machine learning algorithm that relies on a single feature type cannot make full use of its data-processing capability, and its detection performance is mediocre. Two-view co-training handles poorly the case in which the two classifiers give opposite predictions for an unknown sample. The paper therefore adopts a three-view co-training algorithm: when the three classifiers disagree on an unknown sample, the label is decided by a vote in which the minority defers to the majority, which gives fairly good results. The method reverse-engineers APK files and extracts features, selecting three non-overlapping sub-views: requested-permission features, API call sequence features, and OpCode features. For each sub-view, the best-performing algorithm is selected to build a classifier. On this basis, following the Co-training approach, the three classifiers are trained collaboratively, so that the detection performance of all three individual classifiers improves together even when few labeled samples are available. A total of 4,600 benign samples were downloaded from Android markets and 4,360 recent malware samples from the malware-sharing site VirusShare. Ten groups of experiments were run with the number of labeled samples ranging from 30 to 120, and about 1,800 samples were classified in testing. The experimental results show that the proposed method achieves better detection performance.
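The three-view voting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the per-view algorithms or the rule for choosing which unlabeled samples to add, so the toy `CentroidClassifier`, the `batch_size` parameter, and the function names `tri_view_co_train` and `predict` are all assumptions made for illustration. Each sample is assumed to be a tuple of three numeric feature vectors, one per view (permissions, API call sequence, OpCode).

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers (three binary votes cannot tie)."""
    return Counter(predictions).most_common(1)[0][0]


class CentroidClassifier:
    """Toy nearest-centroid classifier; a hypothetical stand-in for each view's model."""

    def fit(self, rows, labels):
        sums, counts = {}, Counter(labels)
        for row, label in zip(rows, labels):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        # One mean vector (centroid) per class label.
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, row):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(row, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def tri_view_co_train(labeled, labels, unlabeled, batch_size=2):
    """Grow the labeled set with majority-voted pseudo-labels, retraining after each batch.

    Assumption: every pooled sample is eventually added; a real system would
    likely add only samples the classifiers label with high confidence.
    """
    labeled, labels, pool = list(labeled), list(labels), list(unlabeled)

    def train():
        # One classifier per view, each seeing only its own feature vector.
        return [
            CentroidClassifier().fit([sample[v] for sample in labeled], labels)
            for v in range(3)
        ]

    models = train()
    while pool:
        batch, pool = pool[:batch_size], pool[batch_size:]
        for sample in batch:
            votes = [model.predict(sample[v]) for v, model in enumerate(models)]
            labeled.append(sample)
            labels.append(majority_vote(votes))
        models = train()
    return models


def predict(models, sample):
    """Classify a sample by majority vote over the three view classifiers."""
    return majority_vote([model.predict(sample[v]) for v, model in enumerate(models)])
```

With three voters and two classes, a disagreement always resolves to a 2-to-1 decision, which is the advantage over two-view co-training that the abstract describes.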

Keywords: machine learning; Co-training; three views; voting; classifier

Classification number: TP301 [Automation and Computer Technology: Computer System Architecture]

 
