Approximate functional dependence discovering algorithm based on Markov blanket


Authors: XIA Xiufeng (夏秀峰); LIU Zhaohui (刘朝辉); ZHANG Anzhen (张安珍) (College of Computer Science, Shenyang Aerospace University, Shenyang 110136, China)

Affiliation: [1] College of Computer Science, Shenyang Aerospace University, Shenyang 110136

Source: Journal of Shenyang Aerospace University (《沈阳航空航天大学学报》), 2023, No. 4, pp. 8-18 (11 pages)

Funding: National Natural Science Foundation of China (Grant No. 62102271).

Abstract: Approximate functional dependency discovery relaxes the condition under which a functional dependency holds, tolerating a certain proportion of violations so that dependencies that genuinely hold can still be discovered in noisy data. However, once the condition is relaxed, existing discovery algorithms tend to report a large number of spurious functional dependencies with many left-hand-side attributes, which significantly reduces the accuracy of the results. To address this problem, an approximate functional dependency discovery algorithm based on the Markov blanket is proposed. It uses the Markov blanket to prune the search space of left-hand-side attributes and shrink the candidate set of determinants, and it applies a downward generalization algorithm to reduce the number of error computations and lower the complexity. Without losing true functional dependencies, the method avoids overfitting of approximate functional dependencies and thereby improves the accuracy of the discovery results. Experimental results show that its accuracy on real and synthetic data sets is better than that of existing approximate functional dependency discovery methods.
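The idea of restricting left-hand-side candidates can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not name an error measure, so the sketch assumes the common g3 error (the fraction of rows that must be removed for a dependency to hold exactly), and it takes the candidate set as a given input rather than computing a Markov blanket. The names `g3_error`, `minimal_afds`, the toy table, and the threshold 0.25 are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def g3_error(rows, lhs, rhs):
    """Fraction of rows that must be removed so that lhs -> rhs holds exactly."""
    if not rows:
        return 0.0
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1
    # In each lhs group, keep only the rows carrying the most frequent rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


def minimal_afds(rows, rhs, candidates, epsilon):
    """Minimal left-hand sides drawn from `candidates` with g3 error <= epsilon."""
    found = []
    for size in range(1, len(candidates) + 1):
        for lhs in combinations(candidates, size):
            if any(set(f) <= set(lhs) for f in found):
                continue  # a subset already qualifies, so lhs is not minimal
            if g3_error(rows, lhs, rhs) <= epsilon:
                found.append(lhs)
    return found


# Toy table (hypothetical): zip determines city, except for one noisy row.
rows = [
    {"zip": "110136", "city": "Shenyang", "dept": "A", "floor": 1},
    {"zip": "110136", "city": "Shenyang", "dept": "A", "floor": 2},
    {"zip": "110136", "city": "Shenyan", "dept": "B", "floor": 1},  # noise
    {"zip": "100084", "city": "Beijing", "dept": "B", "floor": 2},
    {"zip": "100084", "city": "Beijing", "dept": "A", "floor": 1},
]

# Unrestricted search over every other attribute.
unrestricted = minimal_afds(rows, "city", ["zip", "dept", "floor"], 0.25)
# Search restricted to an assumed (not computed) Markov blanket of "city".
restricted = minimal_afds(rows, "city", ["zip"], 0.25)
print(unrestricted)
print(restricted)
```

On this toy table the unrestricted search also accepts the two-attribute left-hand side (dept, floor), which only fits the five rows by chance, while the restricted search keeps just zip. That mirrors the overfitting problem the abstract describes, under the assumptions stated above.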

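Keywords: functional dependency; approximate functional dependency discovery; Markov blanket; noisy data; sampling; left-hand-side attributes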

Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]

 
