Detection of Distributed Financial Bad Data of Enterprises Based on Isolated Forest Algorithm  Cited by: 1


Authors: LI Zixia; ZHOU Bo (Finance Department, Xuancheng Institute of Vocational Technology, Xuancheng 242000, China; School of Computer Science and Information Technology, Hefei University of Technology, Xuancheng 242000, China)

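Affiliations: [1] Finance Department, Xuancheng Institute of Vocational Technology, Xuancheng, Anhui 242000 [2] School of Computer Science and Information Technology, Hefei University of Technology, Xuancheng, Anhui 242000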

Source: Journal of Hubei University of Arts and Science, 2024, No. 8, pp. 22-27 (6 pages)

Funding: Anhui Province Quality Engineering Project (2022cxtd171).

Abstract: To detect bad data in enterprise distributed financial records efficiently and accurately, and so give financial security decisions a reliable data basis, this paper studies bad-data detection based on the isolation forest algorithm. First, the enterprise's distributed financial metadata management system is analyzed, and the metadata catalog in the metadata warehouse is mapped to the list of actual distributed financial data in order to extract that data. Second, the data are preprocessed for noise interference and missing values by combining the Z-score method with median interpolation, to ensure data quality. Third, the distribution characteristics of bad data in the preprocessed data are computed from statistics such as variance, standard deviation, skewness, and kurtosis, and detection is carried out with the isolation forest algorithm, which incorporates the binary tree structure of isolation trees. Experimental results show that preprocessing the data set with the proposed method effectively resolves abnormal spatial distributions in the data, fills in missing values, and repairs distortions caused by noise interference. The maximum detection time is 5.4 s and the maximum detection accuracy is 0.93, giving the method a comparative advantage in both detection efficiency and detection accuracy.
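The pipeline summarized in the abstract (Z-score and median preprocessing, distribution statistics, isolation trees) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the Z-score threshold of 3, the use of population moments, and the tree count and subsample size are all assumptions chosen for illustration, and the scoring formula is the standard isolation forest one rather than anything stated in the abstract.

```python
import math
import random
import statistics


def median_fill(column, z_thresh=3.0):
    """Replace missing entries (None) and |z| > z_thresh readings with the median.

    Assumption: Z-score flags noisy readings and the median of the observed
    values fills both gaps and flagged readings.
    """
    present = [v for v in column if v is not None]
    mean, std = statistics.fmean(present), statistics.pstdev(present)
    med = statistics.median(present)
    return [med if v is None or (std > 0 and abs(v - mean) / std > z_thresh) else v
            for v in column]


def distribution_features(xs):
    """Population variance, standard deviation, skewness and (non-excess) kurtosis."""
    mean = statistics.fmean(xs)
    m2, m3, m4 = (sum((x - mean) ** k for x in xs) / len(xs) for k in (2, 3, 4))
    if m2 == 0:
        return {"var": 0.0, "std": 0.0, "skew": 0.0, "kurt": 0.0}
    return {"var": m2, "std": math.sqrt(m2), "skew": m3 / m2 ** 1.5, "kurt": m4 / m2 ** 2}


def avg_path(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, depth, max_depth, rng):
    """Isolation tree: split on a random feature at a random threshold."""
    if depth >= max_depth or len(rows) <= 1:
        return {"size": len(rows)}
    f = rng.randrange(len(rows[0]))
    lo, hi = min(r[f] for r in rows), max(r[f] for r in rows)
    if lo == hi:
        return {"size": len(rows)}
    split = rng.uniform(lo, hi)
    return {"f": f, "split": split,
            "left": build_tree([r for r in rows if r[f] < split], depth + 1, max_depth, rng),
            "right": build_tree([r for r in rows if r[f] >= split], depth + 1, max_depth, rng)}


def path_length(row, node, depth=0):
    """Depth at which `row` lands in a leaf, plus a correction for unsplit leaf size."""
    if "f" not in node:
        return depth + avg_path(node["size"])
    child = node["left"] if row[node["f"]] < node["split"] else node["right"]
    return path_length(row, child, depth + 1)


def anomaly_scores(rows, n_trees=100, sample_size=64, seed=0):
    """Score in (0, 1]; values near 1 mean short paths, i.e. likely bad data.

    Requires at least two rows so that the normalizing term is non-zero.
    """
    rng = random.Random(seed)
    psi = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(rows, psi), 0, max_depth, rng) for _ in range(n_trees)]
    return [2 ** (-(sum(path_length(r, t) for t in trees) / n_trees) / avg_path(psi))
            for r in rows]
```

In use, each numeric column would be passed through `median_fill`, optionally summarized with `distribution_features`, and the cleaned rows handed to `anomaly_scores`; the records with the highest scores are the candidates for bad data. Records that isolate after few random splits get short average paths and therefore high scores, which is why no distance or density computation is needed.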

Keywords: isolation forest algorithm; isolation tree; distributed financial data; data detection

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
