Medical insurance fraud identification based on isolation loss and deep autoencoder (基于孤立损失和深度自编码器的医保欺诈识别算法)

Authors: LIU Ye (柳叶); WANG Yanan (王亚楠); HOU Wenhui (候文慧); LIU Hui (刘慧)[1]; WANG Jianqiang (王坚强)[1]

Affiliation: [1] School of Business, Central South University, Changsha 410083, China

Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》), 2024, No. 11, pp. 3700-3715 (16 pages)

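Funding: Major Project of the Hunan Provincial Social Science Achievement Review Committee (XSP2023ZDA002); Independent Exploration and Innovation Project of Central South University (2024ZZTS0343).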

Abstract: Medical insurance fraud identification faces two difficulties: fraudulent and normal samples are highly similar and hard to discriminate, and marginal normal samples are misleading. To address them, this paper proposes ISDAE, a medical insurance fraud identification algorithm based on isolation loss and a deep autoencoder (DAE). Exploiting the fact that marginal fraud samples and sparse fraud samples are easy to isolate, the algorithm defines a sample isolation degree that quantifies the difference between the two classes of samples from the perspective of feature distribution. On this basis, it uses the DAE's ability to mine linear and nonlinear features of medical insurance data and, taking into account the interference of marginal normal samples with model training, defines an isolation loss in the latent feature space that aggregates central normal samples and separates marginal normal samples, thereby enlarging the difference between fraudulent and normal samples. The fraud degree of each sample is then evaluated by integrating its isolation value with its reconstruction error, which improves the model's fraud identification performance. Finally, the algorithm is validated on the Tianchi medical insurance dataset; the results show that the overall fraud identification ability of ISDAE is better than that of the comparison methods and that its performance is more stable.
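The abstract gives no formulas for the isolation degree, the isolation loss, or the final score, so the following is only a minimal sketch of the last step it describes: combining an isolation-style value with a reconstruction error into one fraud score. Everything in it is an assumption made for illustration. A linear autoencoder (PCA via SVD) stands in for the deep autoencoder, the mean distance to the `k` nearest neighbours stands in for the isolation degree, and the function names, the weight `alpha`, and the synthetic data are hypothetical. The isolation loss itself is not implemented.

```python
import numpy as np


def reconstruction_error(X, n_components):
    """Squared reconstruction error under a linear autoencoder (PCA via SVD).

    Stand-in for the paper's deep autoencoder, which is not specified here.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions; keep the leading ones.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]          # encoder weights (decoder is W.T)
    Z = Xc @ W.T                   # latent codes
    X_hat = Z @ W                  # reconstruction of the centered data
    return ((Xc - X_hat) ** 2).sum(axis=1)


def knn_isolation(X, k):
    """Mean distance to the k nearest neighbours.

    Hypothetical proxy for an isolation degree: sparse or marginal points
    score higher. This is NOT the paper's isolation measure.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # ignore each point's distance to itself
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def min_max(v):
    """Rescale a vector to [0, 1]; a constant vector maps to zeros."""
    span = v.max() - v.min()
    return (v - v.min()) / span if span > 0 else np.zeros_like(v)


def fraud_score(X, n_components=2, k=5, alpha=0.5):
    """Weighted sum of the two normalized signals (alpha is an assumption)."""
    rec = min_max(reconstruction_error(X, n_components))
    iso = min_max(knn_isolation(X, k))
    return alpha * iso + (1.0 - alpha) * rec


# Synthetic data: normal samples lie near a 2-D plane inside a 5-D space;
# five anomalous samples are pushed off that plane in different directions.
rng = np.random.default_rng(0)
normal = np.hstack([rng.normal(size=(200, 2)),
                    0.05 * rng.normal(size=(200, 3))])
signs = np.array([[1, 1, 1], [1, 1, -1], [1, -1, 1],
                  [-1, 1, 1], [-1, -1, -1]], dtype=float)
anomalies = np.hstack([rng.normal(size=(5, 2)), 3.0 * signs])
X = np.vstack([normal, anomalies])

scores = fraud_score(X)
top5 = np.argsort(scores)[-5:]     # indices of the five highest scores
print(sorted(top5.tolist()))
```

In this toy setting the off-plane samples are both hard to reconstruct from two components and far from their neighbours, so both signals point the same way. On real claims data the two signals can disagree, which is presumably why the abstract integrates them rather than relying on either alone.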

Keywords: medical insurance fraud identification; isolation loss; deep autoencoder; unsupervised learning

Classification: R197.1 [Medicine and Health - Health Services Management]; TP18 [Medicine and Health - Public Health and Preventive Medicine]

 
