Authors: LIU Xiao; WANG Xiaoguo [1] (College of Electronics and Information Engineering, Tongji University, Shanghai 201800, China)
Affiliation: [1] College of Electronics and Information Engineering, Tongji University, Shanghai 201800
Source: Journal of Computer Applications (《计算机应用》), 2019, No. 4, pp. 1214-1219 (6 pages)
Abstract: Banks have accumulated little labeled data on telecommunication fraud, and manual labeling is costly, so supervised learning methods for telecommunication fraud detection lack sufficient labeled data. To address this problem, an unsupervised learning method based on dense subgraphs is proposed for telecommunication fraud detection. First, fraudulent accounts are identified by searching the account-resource network (IP addresses and MAC addresses are collectively called resources) for subgraphs with high suspiciousness. Then, a subgraph suspiciousness metric that fits the characteristics of telecommunication fraud is designed. Finally, a suspicious-subgraph search algorithm is proposed that is disk-resident, has linear memory consumption, and comes with theoretical guarantees. On two synthetic datasets, the proposed method achieves F1-scores of 0.921 and 0.861, higher than those of the CrossSpot, fBox, and EvilCohort algorithms and close to those of the M-Zoom algorithm (0.899 and 0.898), while its average running time and peak memory consumption are both lower than those of M-Zoom. On a real-world dataset, the proposed method achieves an F1-score of 0.550, higher than those of fBox and EvilCohort and close to that of M-Zoom (0.529). The experimental results show that the proposed method is well suited to banks' current anti-telecommunication-fraud operations and to large-scale datasets in practice.
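Keywords: telecommunication fraud; unsupervised learning; fraud detection; dense subgraph; greedy algorithm
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]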
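Illustrative sketch: the abstract describes finding fraudulent accounts by searching an account-resource network for suspicious dense subgraphs with a greedy algorithm. The Python below is a minimal, hedged illustration of that general idea only. It uses the textbook greedy peeling heuristic with plain average-degree density (edges divided by nodes), which is an assumption standing in for the paper's own suspiciousness metric, and it runs entirely in memory rather than disk-resident. The function name `greedy_densest_subgraph` and the sample identifiers are hypothetical.

```python
import heapq
from collections import defaultdict


def greedy_densest_subgraph(edges):
    """Greedy peeling on a bipartite account-resource graph.

    Repeatedly removes the lowest-degree node and returns the densest
    intermediate subgraph, where density = edges / nodes. This density is an
    assumed stand-in, not the paper's suspiciousness metric.
    """
    adj = defaultdict(set)
    for account, resource in edges:
        # Tag each side so an account and a resource never share a node id.
        a, r = ("account", account), ("resource", resource)
        adj[a].add(r)
        adj[r].add(a)

    alive = set(adj)
    if not alive:
        return set(), 0.0

    # Unique integer per node, used as a heap tie-breaker.
    index = {node: i for i, node in enumerate(adj)}
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    heap = [(len(nbrs), index[node], node) for node, nbrs in adj.items()]
    heapq.heapify(heap)

    best_density = n_edges / len(alive)
    removed_order = []
    best_removed = 0  # number of removals made when the best density was seen

    while heap:
        degree, _, node = heapq.heappop(heap)
        if node not in alive or degree != len(adj[node]):
            continue  # stale heap entry left over from an earlier degree
        alive.remove(node)
        removed_order.append(node)
        n_edges -= degree
        for nbr in adj[node]:
            adj[nbr].discard(node)
            heapq.heappush(heap, (len(adj[nbr]), index[nbr], nbr))
        adj[node].clear()
        if alive:
            density = n_edges / len(alive)
            if density > best_density:
                best_density = density
                best_removed = len(removed_order)

    # Nodes peeled after the best point form the densest subgraph found.
    return set(removed_order[best_removed:]), best_density


# Hypothetical data: three accounts sharing the same three resources,
# plus two unrelated accounts that each use one resource of their own.
edges = [(a, r) for a in ("a1", "a2", "a3") for r in ("ip1", "ip2", "mac1")]
edges += [("u1", "ip9"), ("u2", "ip8")]
nodes, density = greedy_densest_subgraph(edges)
suspicious_accounts = sorted(n for kind, n in nodes if kind == "account")
```

In this toy input the three accounts that share every resource form the densest block, so they are the ones flagged. A realistic metric would also need to discount resources that are legitimately shared by many accounts.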