Clustering discretization based Dep-Miner for functional dependency discovery


Authors: Cang Min (仓敏); Wang Jingyi (王静怡); Wu Shuang (吴霜); Zhai Xiaomeng (翟晓萌); Cheng Xi (程曦); Zhu Delv (诸德律) (Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210008, China)

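Affiliation: [1] Economic and Technological Research Institute of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210008, Jiangsu, China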

Source: Journal of Nanjing University of Science and Technology (《南京理工大学学报》), 2023, No. 3, pp. 318-329 (12 pages)

Abstract: Existing functional dependency discovery methods often fail to mine dependencies when applied directly to continuous data. To address this, this paper builds on the existing Dep-Miner method and proposes two functional dependency discovery methods: Dep-Miner based on equal-interval discretization (ED-Dep-Miner) and Dep-Miner based on clustering discretization (CD-Dep-Miner). Data discretization converts the continuous data of each indicator into categorical data, so that functional dependencies are discovered on categorical data and discovery capability improves. The paper also gives accessible proofs, by contradiction and by enumeration, of some theorems in Dep-Miner. ED-Dep-Miner and CD-Dep-Miner are compared experimentally with the original Tane and Dep-Miner, which apply no discretization. The experimental results show that ED-Dep-Miner and CD-Dep-Miner, by converting the original continuous data into discrete categories, mine more potential functional dependencies. In addition, CD-Dep-Miner outperforms ED-Dep-Miner and resolves the boundary value problem of equal-interval discretization.
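The boundary value problem mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy `tiers` and `loads` columns are invented, 1-D k-means stands in for the clustering step (the abstract does not name the clustering algorithm), and `fd_holds` is a naive check of a single dependency, not the Dep-Miner procedure itself.

```python
from collections import defaultdict

def equal_width_bins(values, k):
    """Map each value to one of k equal-width interval labels (0..k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against a constant column
    return [min(int((v - lo) / width), k - 1) for v in values]

def kmeans_1d_bins(values, k, iters=50):
    """Label values with 1-D k-means (Lloyd's algorithm), labels ordered by centre."""
    srt = sorted(values)
    # Deterministic start: spread the initial centres across the sorted values.
    centres = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centres[j]))

    for _ in range(iters):
        groups = defaultdict(list)
        for v in values:
            groups[nearest(v)].append(v)
        new = [sum(groups[j]) / len(groups[j]) if groups[j] else centres[j]
               for j in range(k)]
        if new == centres:
            break
        centres = new
    rank = {j: r for r, j in enumerate(sorted(range(k), key=lambda j: centres[j]))}
    return [rank[nearest(v)] for v in values]

def fd_holds(lhs, rhs):
    """True if lhs -> rhs: rows with equal lhs values always have equal rhs values."""
    seen = {}
    for x, y in zip(lhs, rhs):
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical toy data: a categorical tier and a continuous measurement.
tiers = ["low"] * 3 + ["mid"] * 3 + ["high"] * 3
loads = [1.0, 1.2, 1.5, 3.8, 4.1, 4.3, 9.6, 9.8, 10.0]

ew = equal_width_bins(loads, 3)   # interval edge at 4.0 cuts through the "mid" group
km = kmeans_1d_bins(loads, 3)     # clusters follow the natural groups

raw_ok = fd_holds(tiers, loads)   # tier -> raw load
ew_ok = fd_holds(tiers, ew)       # tier -> equal-width bin
km_ok = fd_holds(tiers, km)       # tier -> cluster label
print(raw_ok, ew_ok, km_ok)
```

On this toy input, the dependency from `tiers` to the raw continuous column does not hold, and equal-width binning still breaks it because one interval edge falls inside a natural group of values. The cluster-based labels keep each group together, so the dependency becomes discoverable.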

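Keywords: clustering; discretization; functional dependency discovery; equal-interval discretization; categorical data; proof by contradiction; enumeration method; boundary value problem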

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
